Module 11.1

NumPy

NumPy is the foundation of scientific computing in Python. It provides fast, memory-efficient arrays that outperform Python lists by orders of magnitude. Core data science libraries - Pandas, Matplotlib, TensorFlow - build on or interoperate with NumPy arrays.

50 min
Intermediate
Hands-on
What You'll Learn
  • Creating NumPy arrays
  • Array indexing and slicing
  • Broadcasting operations
  • Mathematical functions
  • Array manipulation
01

Why NumPy?

Python lists are flexible but slow for numerical operations. NumPy arrays store data contiguously in memory and use optimized C code, making them 10-100x faster for math operations. Whether you're working with sensor data, financial information, images, or anything numerical, NumPy is the tool you'll want to master first.

Real-World Impact: A simple calculation that takes 5 seconds with Python lists takes about 50 milliseconds with NumPy on a million elements - that's 100x faster!
Quick Start: Installation

NumPy comes pre-installed with most Python distributions (Anaconda, Colab, etc.). To ensure you have it:

# Install NumPy
pip install numpy

# Verify it's installed
python -c "import numpy; print(numpy.__version__)"

If you see a version number, you're ready to go!

Learning Tips
Type Along

Don't just read the code - type it into your Python environment and see the results yourself. This helps your brain form connections.

Start Simple

Before complex operations, master the basics: create arrays, access elements, understand shape. Don't jump to linear algebra yet.

Use print() and .shape

Whenever confused, print your array and its shape. This reveals 90% of bugs immediately.

Join the Community

NumPy has great documentation and Stack Overflow answers. Google your error - you're probably not the first!

Key Concept

Vectorized Operations

NumPy performs operations on entire arrays at once without Python loops. This vectorization leverages CPU optimizations and makes code both faster and more readable.

Why it matters: Processing a million numbers takes milliseconds with NumPy, but seconds with Python loops.

Speed Comparison: List vs Array

Processing 1 Million Numbers

Why NumPy arrays outperform Python lists by orders of magnitude

Python List - Slower (Baseline)

Memory layout: each slot stores a pointer (e.g. 0x1A2B, 0x7F5C, ...) to an object scattered across memory, causing cache misses.

Processing: for x in lst: result.append(x * 2) - Python loop overhead, a type check per element, dynamic allocation, and reference dereferencing. Roughly ~500ms for 1 million elements.

NumPy Array - 100x Faster!

Memory layout: values (1 2 3 4 5 ...) stored contiguously in memory for optimal cache locality.

Processing: array * 2 - vectorized C-level operations with no Python overhead, no per-element type checking, SIMD vectorization, and cache-friendly memory access. Roughly ~5ms for 1 million elements.

Bottom line: ~500ms (list) vs ~5ms (NumPy) per operation - a 100x speed advantage.
Why Lists Are Slow
  • Python interpreter overhead on each iteration
  • Type checking for every element operation
  • Objects scattered in memory (cache misses)
  • Dynamic allocation and deallocation
Why NumPy is Fast
  • Compiled C/Fortran operations (no Python overhead)
  • Contiguous memory layout (cache hits)
  • SIMD vectorization (multi-element ops)
  • Fixed data types (no checking needed)
Python List (Slow)
# Multiply each element by 2
# Must loop explicitly - Python checks type each time!
numbers = [1, 2, 3, 4, 5]
result = []
for n in numbers:
    result.append(n * 2)  # Slow: type checking per iteration
print(result)  # [2, 4, 6, 8, 10]

# Sum all elements
total = sum(numbers)  # 15
NumPy Array (Fast)
import numpy as np

# Multiply each element by 2
# No loop needed! Operation on entire array at C level
arr = np.array([1, 2, 3, 4, 5])
result = arr * 2  # Fast: one operation on entire array
print(result)  # [2 4 6 8 10]

# Sum all elements
total = arr.sum()  # 15 - also optimized!

NumPy eliminates the loop entirely. Operations like * 2 apply to all elements at once, and methods like .sum() replace Python's built-in functions.

02

Creating Arrays

There are many ways to create NumPy arrays depending on your needs. You can convert existing Python lists, generate arrays with specific patterns, create arrays filled with specific values, or generate random arrays. Let's explore each approach so you can choose the best one for your situation.

Array Creation Methods at a Glance
Method             Purpose            Example
np.array()         From Python list   np.array([1, 2, 3])
np.zeros()         Array of zeros     np.zeros((3, 4))
np.ones()          Array of ones      np.ones((2, 5))
np.arange()        Sequential values  np.arange(0, 10, 2)
np.linspace()      Evenly spaced      np.linspace(0, 1, 5)
np.random.rand()   Random 0-1         np.random.rand(3, 3)

From Lists

The simplest way to create a NumPy array is from a Python list using np.array(). This creates a copy of your data in NumPy's optimized format. Think of it as converting your data from a slow Python format to a fast NumPy format.

import numpy as np

# 1D array from list
arr1d = np.array([1, 2, 3, 4, 5])
print(arr1d)  # [1 2 3 4 5]

# 2D array from nested lists
arr2d = np.array([[1, 2, 3], [4, 5, 6]])
print(arr2d)
# [[1 2 3]
#  [4 5 6]]

# 3D array from nested lists
arr3d = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])
print(f"3D shape: {arr3d.shape}")  # (2, 2, 2)

Creating arrays from lists: Pass any Python list to np.array() to convert it into a NumPy array. For 1D arrays, use a simple flat list like [1, 2, 3]. For 2D arrays (matrices), use nested lists where each inner list becomes a row. The nesting level determines dimensions: one bracket = 1D, two brackets = 2D, three brackets = 3D, and so on.

Pro tip: NumPy creates a copy of your list data, so modifying the original list won't affect the array. All elements are converted to a common data type (usually int64 or float64) for efficient computation. If you mix integers and floats, all values become floats.
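Both points are easy to verify interactively. A quick sketch (the values are arbitrary):

```python
import numpy as np

# Mixing ints and floats promotes everything to float
mixed = np.array([1, 2, 3.5])
print(mixed.dtype)   # float64
print(mixed)         # [1.  2.  3.5]

# np.array() copies the list - changing the list doesn't touch the array
source = [1, 2, 3]
arr = np.array(source)
source[0] = 99
print(arr[0])        # 1 - the array is unaffected
```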

Built-in Generators

NumPy provides convenient functions to create arrays with specific patterns without having to manually type all the values. This is especially useful when you know the shape but not the exact values. Use zeros() for placeholders and ones() for arrays filled with 1s.

# Zeros and ones
zeros = np.zeros((3, 4))      # 3x4 array of zeros
ones = np.ones((2, 3))        # 2x3 array of ones

# Ranges
range_arr = np.arange(0, 10, 2)  # [0 2 4 6 8]
linspace = np.linspace(0, 1, 5)  # [0. 0.25 0.5 0.75 1.]

# Identity matrix
identity = np.eye(3)  # 3x3 identity matrix

np.zeros(shape) creates an array filled with zeros - perfect for initializing arrays before filling them with computed values. np.ones(shape) does the same but with ones, useful for masks or starting values in algorithms.

np.arange(start, stop, step) works like Python's range() but returns a NumPy array instead of a range object. It's ideal when you need evenly spaced integers or floats with a specific step size.

np.linspace(start, stop, num) creates exactly num evenly spaced values between start and stop (both included). Unlike arange(), you specify how many values you want, not the step size - great for plotting smooth curves.

np.eye(n) creates an identity matrix (1s on diagonal, 0s elsewhere). Identity matrices are fundamental in linear algebra - multiplying any matrix by the identity returns the original matrix.

Random Arrays

# Random values between 0 and 1
rand = np.random.rand(3, 3)

# Random integers
randint = np.random.randint(1, 100, size=(2, 4))

# Normal distribution (mean=0, std=1)
normal = np.random.randn(5)

# Set seed for reproducibility
np.random.seed(42)

np.random.rand(d0, d1, ...) generates random floats uniformly distributed between 0 and 1. Pass dimensions directly as arguments (not as a tuple). This is commonly used for initializing neural network weights or creating test data.

np.random.randint(low, high, size) generates random integers from low (inclusive) to high (exclusive). The size parameter accepts a tuple for multi-dimensional arrays. Perfect for simulations, sampling, or generating test indices.

np.random.randn(d0, d1, ...) generates samples from the standard normal distribution (mean=0, std=1). The 'n' stands for 'normal'. This is essential for statistical simulations and machine learning initialization.

np.random.seed(n) sets the random number generator's starting point. Using the same seed always produces the same sequence of random numbers - crucial for reproducible experiments, debugging, and sharing code that others can verify.
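A minimal sketch of seed-based reproducibility (the array sizes are arbitrary):

```python
import numpy as np

# Same seed -> same "random" numbers, run after run
np.random.seed(42)
first = np.random.rand(3)

np.random.seed(42)   # reset the generator to the same starting point
second = np.random.rand(3)

print(np.array_equal(first, second))  # True - identical sequences
```

Recent NumPy versions also offer the Generator API - rng = np.random.default_rng(42) - which is the recommended way to get seeded, reproducible randomness in new code.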

Special Arrays

import numpy as np

# Full array with specific value
full_arr = np.full((3, 3), 7)
print(full_arr)
# [[7 7 7]
#  [7 7 7]
#  [7 7 7]]

# Empty array (uninitialized - fast but random values!)
empty = np.empty((2, 3))  # Don't use values directly!

# Array like another
template = np.array([[1, 2], [3, 4]])
zeros_like = np.zeros_like(template)
ones_like = np.ones_like(template)

# Diagonal matrix
diag = np.diag([1, 2, 3])
print(diag)
# [[1 0 0]
#  [0 2 0]
#  [0 0 3]]

# Extract diagonal from matrix
matrix = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
diagonal = np.diag(matrix)
print(diagonal)  # [1 5 9]

np.full(shape, value) creates an array of any shape filled entirely with a specific value. Unlike zeros() or ones(), you choose the fill value - useful for initializing arrays with default values like -1 for "not found" or infinity for minimum-finding algorithms.

np.empty(shape) allocates memory without initializing values - it's the fastest way to create an array, but the values are garbage (whatever was in memory). Only use when you'll immediately overwrite all values.

np.zeros_like(arr) and np.ones_like(arr) create new arrays with the same shape and dtype as an existing array. This ensures compatibility when you need a result array matching your input array's structure.

np.diag() is a dual-purpose function: pass a 1D array to create a diagonal matrix with those values on the diagonal; pass a 2D matrix to extract its diagonal elements as a 1D array. Essential for linear algebra operations.

Creating Arrays from Functions

import numpy as np

# From function
def func(i, j):
    return i + j

arr = np.fromfunction(func, (3, 4), dtype=int)
print(arr)
# [[0 1 2 3]
#  [1 2 3 4]
#  [2 3 4 5]]

# Meshgrid for 2D coordinates
x = np.array([0, 1, 2])
y = np.array([0, 1, 2, 3])
xx, yy = np.meshgrid(x, y)
print("X grid:\n", xx)
print("Y grid:\n", yy)

# Useful for 2D function evaluation
z = xx**2 + yy**2  # Distance from origin squared

np.fromfunction(func, shape) creates an array by calling your function with the indices as arguments. The function receives the row index as the first argument and column index as the second. This is powerful for creating arrays where each element depends on its position.

np.meshgrid(x, y) takes two 1D arrays and creates two 2D arrays representing all combinations of x and y coordinates. Think of it as creating a grid where xx has x-values repeated down rows, and yy has y-values repeated across columns.

Why meshgrid? It's essential for 3D surface plots and contour plots. Instead of writing nested loops to evaluate a function at every (x, y) point, you simply write z = f(xx, yy) and NumPy computes all values at once using vectorization.

Common use case: Creating heatmaps, plotting mathematical surfaces like z = x² + y², or generating coordinate grids for image processing and computer graphics applications.

03

Array Anatomy

Arrays aren't just data - they have metadata that describes the data. Understanding these properties helps you debug errors, write correct code, and optimize performance. Think of these as the 'blueprint' or 'fingerprint' of your array.

Key Idea: Before operating on an array, always check its shape, size, and dtype. Mismatches are a common source of bugs.
Array Properties Visualized
Understanding shape, size, ndim, and dtype
2D Array Structure (3 x 4)
        Col 0  Col 1  Col 2  Col 3
Row 0     1      2      3      4
Row 1     5      6      7      8
Row 2     9     10     11     12

4 columns (axis 1), 3 rows (axis 0)
arr = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])
.shape - dimensions tuple: (3, 4)
.ndim - number of axes: 2
.size - total elements: 12
.dtype - data type: int64

Quick Memory Trick
  • shape - (rows, cols) tuple
  • ndim - len(shape)
  • size - product of shape
  • dtype - int, float, bool...

Accessing Properties

import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])

print(arr.shape)   # (2, 3) - 2 rows, 3 columns
print(arr.ndim)    # 2 - number of dimensions
print(arr.size)    # 6 - total elements
print(arr.dtype)   # int64 - data type

Shape returns a tuple of dimensions. For a 2D array, it's (rows, columns).

Data Types

dtype               Description       Example
int32, int64        Integers          np.array([1, 2, 3])
float32, float64    Floating point    np.array([1.5, 2.5])
bool                Boolean           np.array([True, False])
complex64           Complex numbers   np.array([1+2j])
str_                Strings           np.array(['a', 'b'])
# Specify dtype when creating
arr_float = np.array([1, 2, 3], dtype=np.float64)
print(arr_float)  # [1. 2. 3.]

# Convert dtype
arr_int = arr_float.astype(np.int32)
print(arr_int)  # [1 2 3]

Use astype() to convert between types. Be careful converting floats to ints as decimals are truncated.
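To see the truncation pitfall concretely, here's a short sketch (values chosen to highlight the edge cases):

```python
import numpy as np

# astype() truncates toward zero - it does NOT round
floats = np.array([1.9, -1.9, 2.5])
ints = floats.astype(np.int32)
print(ints)  # [ 1 -1  2]

# Round first if rounding is what you actually want
# (np.round rounds halves to the nearest even value, so 2.5 -> 2)
rounded = np.round(floats).astype(np.int32)
print(rounded)  # [ 2 -2  2]
```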

Practice: Array Creation

Task: Create an array of even numbers from 2 to 20 (inclusive).

Solution:
import numpy as np

evens = np.arange(2, 21, 2)
print(evens)  # [ 2  4  6  8 10 12 14 16 18 20]

Task: Create a 3x3 array filled with zeros.

Solution:
import numpy as np

zeros = np.zeros((3, 3))
print(zeros)

Task: Create an array of 10 evenly spaced values between 0 and 5.

Solution:
import numpy as np

spaced = np.linspace(0, 5, 10)
print(spaced)
04

Indexing and Slicing

Getting data out of arrays is just as important as putting it in. NumPy supports four indexing methods: basic indexing (like Python lists), slicing (getting ranges), boolean indexing (filtering with conditions), and fancy indexing (selecting arbitrary positions). These skills are essential for extracting exactly the data you need.

Index reminder: NumPy, like Python, uses 0-based indexing. The first element is at index 0, the last at index -1.

1D Array Indexing

import numpy as np

arr = np.array([10, 20, 30, 40, 50])

print(arr[0])     # 10 - first element
print(arr[-1])    # 50 - last element
print(arr[1:4])   # [20 30 40] - slice
print(arr[::2])   # [10 30 50] - every 2nd element

Indexing works like Python lists. Negative indices count from the end. Slicing uses [start:stop:step].

2D Array Indexing

arr2d = np.array([[1, 2, 3],
                  [4, 5, 6],
                  [7, 8, 9]])

print(arr2d[0, 0])     # 1 - row 0, col 0
print(arr2d[1, 2])     # 6 - row 1, col 2
print(arr2d[0])        # [1 2 3] - entire row 0
print(arr2d[:, 1])     # [2 5 8] - entire column 1
print(arr2d[0:2, 1:3]) # [[2 3] [5 6]] - subarray

Single element access: Use arr[row, col] to get one element. Unlike nested Python lists where you'd write list[row][col], NumPy uses a single bracket with comma-separated indices - it's faster and cleaner.

Row access: arr[0] returns the entire first row as a 1D array. This is shorthand for arr[0, :] where the colon means "all columns".

Column access: arr[:, 1] returns the entire second column. The colon before the comma means "all rows", and 1 specifies the column index. This returns a 1D array, not a column vector.

Subarray slicing: arr[0:2, 1:3] extracts a rectangular region - rows 0 to 1 (not including 2) and columns 1 to 2 (not including 3). This is called "slicing" and creates a view of the original data, not a copy.
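The view behavior is worth verifying yourself. A short sketch (the matrix values are arbitrary):

```python
import numpy as np

# Slices are views: modifying the slice modifies the original
matrix = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
sub = matrix[0:2, 1:3]
sub[0, 0] = 99
print(matrix[0, 1])  # 99 - the original changed too!

# Use .copy() when you need an independent subarray
safe = matrix[0:2, 1:3].copy()
safe[0, 0] = -1
print(matrix[0, 1])  # still 99 - original untouched
```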

Boolean Indexing

arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

# Create boolean mask
mask = arr > 5
print(mask)  # [False False False False False True True True True True]

# Apply mask to filter
filtered = arr[mask]
print(filtered)  # [ 6  7  8  9 10]

# Combined in one line
evens = arr[arr % 2 == 0]
print(evens)  # [ 2  4  6  8 10]

Boolean masks: When you write arr > 5, NumPy compares each element to 5 and returns an array of True/False values (called a "mask"). This mask has the exact same shape as your original array.

Applying the mask: When you use a boolean array as an index like arr[mask], NumPy returns only the elements where the mask is True. The result is always a 1D array containing the matching elements.

One-liner filtering: You can combine both steps: arr[arr > 5] creates the mask and applies it immediately. This is the most common pattern for filtering data - concise and readable.

Multiple conditions: Combine conditions using & (and), | (or), and ~ (not). Important: use parentheses around each condition, e.g., arr[(arr > 2) & (arr < 8)]. Python's and/or keywords don't work with arrays!
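A quick sketch of all three operators on a small example array:

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

# & is "and" - parentheses around each condition are required
mid = arr[(arr > 2) & (arr < 8)]
print(mid)  # [3 4 5 6 7]

# | is "or"
extremes = arr[(arr < 3) | (arr > 8)]
print(extremes)  # [ 1  2  9 10]

# ~ is "not"
not_even = arr[~(arr % 2 == 0)]
print(not_even)  # [1 3 5 7 9]
```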

Fancy Indexing

import numpy as np

arr = np.array([10, 20, 30, 40, 50])

# Index with list of indices
indices = [0, 2, 4]
selected = arr[indices]
print(selected)  # [10 30 50]

Basic fancy indexing: Instead of using a single index or slice, you can pass a list of specific indices to select multiple elements at once. Here [0, 2, 4] picks the 1st, 3rd, and 5th elements. The result is a new array containing just those values in the order you specified.

Key difference from slicing: Slices like arr[0:5:2] can only select evenly-spaced elements. Fancy indexing lets you pick any arbitrary positions - even repeating indices like arr[[0, 0, 2]] to duplicate elements!

# Works in 2D too
matrix = np.arange(12).reshape(3, 4)
print(matrix)
# [[ 0  1  2  3]
#  [ 4  5  6  7]
#  [ 8  9 10 11]]

# Select specific elements
rows = [0, 1, 2]
cols = [1, 2, 3]
diagonal_elements = matrix[rows, cols]
print(diagonal_elements)  # [ 1  6 11]

2D fancy indexing: For 2D arrays, provide two lists - one for row indices and one for column indices. NumPy pairs them up: rows[0] with cols[0], rows[1] with cols[1], etc. So matrix[[0,1,2], [1,2,3]] selects elements at positions (0,1), (1,2), and (2,3).

Result shape: The output is a 1D array with the same length as your index lists. This is different from slicing which preserves the 2D structure. Think of it as "cherry-picking" specific cells from your matrix.

# Select entire rows
selected_rows = matrix[[0, 2]]
print(selected_rows)
# [[ 0  1  2  3]
#  [ 8  9 10 11]]

Selecting whole rows: When you provide only row indices like matrix[[0, 2]], NumPy returns those complete rows as a 2D array. This is equivalent to matrix[[0, 2], :]. It's perfect for extracting a subset of rows in any order.

Copy vs View: Unlike slicing which creates a "view" (reference to original data), fancy indexing always creates a copy. Modifying the result won't affect the original array. This is important for memory management and avoiding unintended side effects.
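A short sketch contrasting the copy (fancy indexing) with the view (slicing):

```python
import numpy as np

arr = np.array([10, 20, 30, 40, 50])

# Fancy indexing returns a copy...
picked = arr[[0, 2, 4]]
picked[0] = -1
print(arr[0])  # 10 - original unchanged

# ...while a slice returns a view
sliced = arr[0:3]
sliced[0] = -1
print(arr[0])  # -1 - original changed!
```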

Modifying Values with Indexing

import numpy as np

arr = np.arange(10)
print("Original:", arr)  # [0 1 2 3 4 5 6 7 8 9]

# Modify slice
arr[2:5] = 99
print("After slice modify:", arr)  # [ 0  1 99 99 99  5  6  7  8  9]

Slice assignment: You can assign a single value to a slice, and NumPy will broadcast that value to all positions in the slice. Here arr[2:5] = 99 sets elements at indices 2, 3, and 4 to 99. This is called "broadcasting" a scalar to multiple elements.

Array assignment: You can also assign an array of the same size: arr[2:5] = [10, 20, 30] would set index 2 to 10, index 3 to 20, and index 4 to 30. The shapes must match or be broadcastable!

# Modify with boolean mask
arr = np.arange(10)
arr[arr > 5] = -1
print("After boolean modify:", arr)  # [ 0  1  2  3  4  5 -1 -1 -1 -1]

Boolean mask assignment: Combine filtering and modification in one step! arr[arr > 5] = -1 finds all elements greater than 5 and replaces them with -1. The original array is modified in-place - no new array is created.

Common use cases: Clamping values (arr[arr > 100] = 100), replacing invalid data (arr[arr < 0] = 0), or cleaning outliers. This pattern is incredibly powerful for data preprocessing and much faster than Python loops.
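As a concrete sketch, clamping hypothetical sensor readings to a 0-100 range with boolean-mask assignment:

```python
import numpy as np

# Clamp out-of-range readings (example data)
readings = np.array([-5, 42, 130, 88, -1, 101])
readings[readings > 100] = 100
readings[readings < 0] = 0
print(readings)  # [  0  42 100  88   0 100]
```

NumPy also provides np.clip(readings, 0, 100), which does the same clamping in one call without modifying the input.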

# Modify with fancy indexing
arr = np.arange(10)
arr[[1, 3, 5, 7]] = 100
print("After fancy modify:", arr)  # [  0 100   2 100   4 100   6 100   8   9]

Fancy index assignment: Use a list of indices to modify specific non-contiguous positions. arr[[1, 3, 5, 7]] = 100 sets elements at those exact indices to 100. You can assign a single value (broadcast) or a matching-length array.

Multiple different values: To assign different values to each position, use arr[[1, 3, 5, 7]] = [10, 30, 50, 70]. The index list and value array must have the same length. This is perfect for updating specific known positions in your data.

Practice: Indexing

Task: Given arr = np.arange(10), extract the last 3 elements.

Solution:
import numpy as np

arr = np.arange(10)
last_three = arr[-3:]
print(last_three)  # [7 8 9]

Task: Given a 3x4 array, extract the second column.

Solution:
import numpy as np

arr = np.arange(12).reshape(3, 4)
second_col = arr[:, 1]
print(second_col)  # [1 5 9]

Task: From arr = np.arange(20), get elements greater than 5 AND less than 15.

Solution:
import numpy as np

arr = np.arange(20)
filtered = arr[(arr > 5) & (arr < 15)]
print(filtered)  # [ 6  7  8  9 10 11 12 13 14]

Task: Reverse np.arange(10) using slicing.

Solution:
import numpy as np

arr = np.arange(10)
reversed_arr = arr[::-1]
print(reversed_arr)  # [9 8 7 6 5 4 3 2 1 0]

Task: Extract the diagonal elements from a 4x4 matrix using fancy indexing.

Solution:
import numpy as np

matrix = np.arange(16).reshape(4, 4)
print(matrix)

# Method 1: np.diag
diagonal = np.diag(matrix)

# Method 2: Fancy indexing
indices = np.arange(4)
diagonal = matrix[indices, indices]
print(diagonal)  # [ 0  5 10 15]
05

Array Operations

Operations are where NumPy truly shines compared to Python lists. You can perform calculations on entire arrays at once without writing loops. This section covers element-wise operations (applied to each element), aggregations (combining elements into a single value), and broadcasting (stretching smaller arrays to match larger ones). Master these and your code will be fast, readable, and elegant.

Performance Tip: Always choose vectorized operations over loops. Your code will run 100x faster!

Element-wise Operations

import numpy as np

a = np.array([1, 2, 3, 4])
b = np.array([10, 20, 30, 40])

print(a + b)   # [11 22 33 44]
print(a * b)   # [10 40 90 160]
print(a ** 2)  # [ 1  4  9 16]
print(np.sqrt(a))  # [1. 1.414 1.732 2.]

Arithmetic operators: When you use +, -, *, /, ** between two arrays, NumPy applies the operation to each corresponding pair of elements. a + b adds a[0]+b[0], a[1]+b[1], etc. This is called "vectorized" operation - no loops needed!

Same-shape requirement: Both arrays must have the same shape, or one must be broadcastable to the other. If shapes don't match, NumPy raises a ValueError. Always check shapes with .shape before operations.

Universal functions (ufuncs): Functions like np.sqrt(), np.sin(), np.exp() are "universal functions" that apply to every element automatically. They're optimized in C and much faster than Python's math module with loops.

In-place operations: Use +=, *=, etc. to modify arrays without creating copies: a += 1 adds 1 to every element of a directly, saving memory for large arrays.
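A small sketch of in-place arithmetic, including one caveat: an in-place op can't change the array's dtype:

```python
import numpy as np

a = np.array([1, 2, 3, 4])
a += 1          # modifies a in place - no new array allocated
print(a)        # [2 3 4 5]

# Caveat: in-place ops keep the original dtype
b = np.array([1, 2, 3])  # int array
# b += 0.5 would raise - a float result can't be cast back into an int array
c = b + 0.5     # out-of-place creates a new float array instead
print(c)        # [1.5 2.5 3.5]
```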

Aggregation Functions

arr = np.array([[1, 2, 3], [4, 5, 6]])

print(arr.sum())        # 21 - sum of all
print(arr.sum(axis=0))  # [5 7 9] - sum each column
print(arr.sum(axis=1))  # [6 15] - sum each row
print(arr.mean())       # 3.5
print(arr.min(), arr.max())  # 1, 6
print(arr.std())        # 1.707 - standard deviation

Global aggregation: Without the axis parameter, functions like sum(), mean(), min(), max() operate on ALL elements regardless of shape. The result is a single scalar value.

axis=0 (columns): Setting axis=0 collapses rows, computing one result per column. Think of it as "working down" the array vertically. For a 2D array with shape (3, 4), sum(axis=0) returns shape (4,).

axis=1 (rows): Setting axis=1 collapses columns, computing one result per row. Think of it as "working across" horizontally. For shape (3, 4), sum(axis=1) returns shape (3,).

Other useful aggregations: std() for standard deviation, var() for variance, prod() for product, argmin()/argmax() for indices of min/max values. These all accept the axis parameter.
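A quick sketch of argmax with and without an axis (prod() included for completeness; the example values are arbitrary):

```python
import numpy as np

arr = np.array([[3, 7, 1],
                [9, 2, 5]])

print(arr.argmax())        # 3 - index of 9 in the flattened array
print(arr.argmax(axis=0))  # [1 0 1] - row index of each column's max
print(arr.argmax(axis=1))  # [1 0] - column index of each row's max
print(arr.prod())          # 1890 - product of all elements
```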

Broadcasting

# Scalar broadcast
arr = np.array([[1, 2, 3], [4, 5, 6]])
result = arr + 10
print(result)  # [[11 12 13] [14 15 16]]

# 1D broadcast to 2D
row = np.array([1, 0, 1])
result = arr * row
print(result)  # [[1 0 3] [4 0 6]]

Scalar broadcasting: When you operate on an array with a single number (scalar), NumPy "broadcasts" that number to match the array's shape. arr + 10 adds 10 to every element - it's like the scalar expands to fill the entire array shape.

1D to 2D broadcasting: A 1D array can broadcast across a 2D array if their trailing dimensions match. Here, row has shape (3,) and arr has shape (2, 3). NumPy stretches row vertically to shape (2, 3) by repeating it for each row.

Why broadcasting matters: It lets you write concise code without explicit loops or creating large temporary arrays. arr * row multiplies each row by the same values without actually copying row in memory.

Broadcasting rules: Dimensions are compared right-to-left. Each dimension must either be equal, or one of them must be 1 (or missing). See the Broadcasting Deep Dive section for complete rules and visualizations.

Reshaping Arrays

arr = np.arange(12)
print(arr)  # [ 0  1  2  3  4  5  6  7  8  9 10 11]

# Reshape to 3x4
reshaped = arr.reshape(3, 4)
print(reshaped)
# [[ 0  1  2  3]
#  [ 4  5  6  7]
#  [ 8  9 10 11]]

reshape(new_shape): Changes the array's dimensions without modifying the underlying data. A 12-element 1D array can become 3×4, 4×3, 2×6, or even 2×2×3. The key rule: the product of all dimensions must equal the total number of elements.

Row-major order: NumPy fills the new shape row by row (C-style). Elements 0-3 become row 0, elements 4-7 become row 1, etc. This is called "row-major" order and is important when reshaping data that has a specific structure.

# Using -1 for automatic dimension calculation
auto_reshape = arr.reshape(3, -1)  # NumPy figures out 4 columns
print(auto_reshape.shape)  # (3, 4)

auto_reshape2 = arr.reshape(-1, 6)  # NumPy figures out 2 rows
print(auto_reshape2.shape)  # (2, 6)

The -1 shortcut: Use -1 for exactly one dimension and NumPy calculates it automatically. reshape(3, -1) on 12 elements gives (3, 4) because 12÷3=4. This is incredibly useful when processing batches of data where one dimension varies.

Common patterns: reshape(-1) flattens to 1D. reshape(-1, 1) creates a column vector. reshape(1, -1) creates a row vector. These are essential for preparing data for machine learning models.

# Flatten back to 1D
flat = reshaped.flatten()
print(flat)  # [ 0  1  2  3  4  5  6  7  8  9 10 11]

# ravel() - similar but returns a view when possible
raveled = reshaped.ravel()
print(raveled)  # [ 0  1  2  3  4  5  6  7  8  9 10 11]

flatten(): Always creates a new copy of the data as a 1D array. Safe to modify without affecting the original array. Use when you need an independent flattened copy.

ravel(): Returns a flattened view when possible (more memory efficient). If the original array is contiguous in memory, changes to the raveled array WILL affect the original! Use when you only need to read the data or when memory is tight.
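The flatten/ravel difference is easiest to see by mutating the results. A short sketch:

```python
import numpy as np

matrix = np.arange(6).reshape(2, 3)

flat = matrix.flatten()   # always a copy
flat[0] = 99
print(matrix[0, 0])       # 0 - original untouched

rav = matrix.ravel()      # a view here (matrix is contiguous)
rav[0] = 99
print(matrix[0, 0])       # 99 - original changed!
```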

# Transpose (swap rows/columns)
transposed = reshaped.T
print(transposed.shape)  # (4, 3)
print(transposed)
# [[ 0  4  8]
#  [ 1  5  9]
#  [ 2  6 10]
#  [ 3  7 11]]

Transpose (.T): Swaps rows and columns - a (3, 4) array becomes (4, 3). Element at [i, j] moves to [j, i]. The first row [0,1,2,3] becomes the first column. Essential for matrix multiplication where dimensions must align.

Transpose is a view: .T doesn't copy data; it just changes how NumPy interprets the indices. This makes it very fast, but modifying the transposed array also modifies the original. For higher dimensions, use np.transpose(arr, axes) to specify axis order.
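A short sketch of both behaviors (the arrays here are arbitrary):

```python
import numpy as np

m = np.arange(6).reshape(2, 3)
t = m.T                  # (3, 2) view - no data copied
t[0, 1] = 99             # element [0, 1] of t is element [1, 0] of m
print(m[1, 0])           # 99 - the original saw the change

# For more than 2 dimensions, give the axis order explicitly
cube = np.arange(24).reshape(2, 3, 4)
moved = np.transpose(cube, (2, 0, 1))
print(moved.shape)       # (4, 2, 3)
```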

Practice: Operations

Task: Create arr = [10, 20, 30, 40, 50] and print mean, min, and max.

Solution:
import numpy as np

arr = np.array([10, 20, 30, 40, 50])
print(f"Mean: {arr.mean()}")  # 30.0
print(f"Min: {arr.min()}")    # 10
print(f"Max: {arr.max()}")    # 50

Task: Normalize arr to range 0-1 using (arr - min) / (max - min).

Solution:
import numpy as np

arr = np.array([10, 20, 30, 40, 50])
normalized = (arr - arr.min()) / (arr.max() - arr.min())
print(normalized)  # [0. 0.25 0.5 0.75 1.]

Task: Create a 3x4 array and calculate the sum of each row.

Solution:
import numpy as np

arr = np.arange(12).reshape(3, 4)
row_sums = arr.sum(axis=1)
print(row_sums)  # [ 6 22 38]

Task: Multiply a 2x3 matrix by a 3x2 matrix using @ operator.

Solution:
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])  # 2x3
b = np.array([[1, 2], [3, 4], [5, 6]])  # 3x2
result = a @ b  # 2x2
print(result)  # [[22 28] [49 64]]

Task: Find the index of the maximum value in each row of a 2D array.

Solution:
import numpy as np

arr = np.random.randint(0, 100, (3, 5))
max_indices = arr.argmax(axis=1)
print(f"Array:\n{arr}")
print(f"Max indices per row: {max_indices}")
06

Broadcasting Deep Dive

Broadcasting is one of NumPy's most powerful features, but it confuses many beginners. In simple terms: when you perform an operation on arrays of different shapes, NumPy automatically 'stretches' the smaller array to match the larger one. This doesn't use extra memory - it happens virtually. Once you understand broadcasting, you'll write much cleaner code.

Beginner Tip: Don't be intimidated by broadcasting rules. Start simple: broadcast a scalar to an array, then try a 1D array with a 2D array. The pattern becomes clear quickly!
Key Concept

Broadcasting Rules

When operating on two arrays, NumPy compares their shapes element-wise from right to left. Two dimensions are compatible when they are equal, or one of them is 1.

Rule: If shapes don't match, the smaller array is "stretched" to match the larger one without copying data.

Broadcasting Rules Explained (Beginner Guide)
Step-by-step breakdown with visual examples
1. Align Shapes from the RIGHT

NumPy compares dimensions starting from the rightmost (trailing) dimension and moves left.

Visual example - shape (3, 4, 5) vs (5,):

    A: (3, 4, 5)
    B:       (5,)   <- comparison starts here, at the rightmost dimension

2. Dimensions Must Be EQUAL or ONE of Them Is 1

For each dimension pair, they're compatible if: same size OR one is 1 OR one is missing.

  • 5 == 5 - COMPATIBLE (same size)
  • 5 vs 1 - COMPATIBLE (the 1 stretches)
  • 5 vs 3 - INCOMPATIBLE (neither is 1!)

Common Error: ValueError: operands could not be broadcast together means dimensions don't match and neither is 1!

3. The "1" Dimension STRETCHES (Virtually)

When a dimension is 1, NumPy virtually repeats that array along that axis. No extra memory is used!

A (1, 3) row [1 2 3] behaves like the (3, 3) array [[1 2 3], [1 2 3], [1 2 3]] - the extra rows are virtual copies; NumPy is smart and reuses the same data.

Step-by-Step Example: Can (3, 4) broadcast with (4,)?

Step 1 - Align right:

    (3, 4)
       (4,)

Step 2 - Compare: 4 == 4 (match); 3 vs missing (a missing dimension counts as 1).

Step 3 - Result shape: (3, 4). Broadcasts!

Broadcasting Visualization
How NumPy stretches arrays to match shapes

Scalar + Array (simple): 5 + np.array([1, 2, 3]) - the scalar broadcasts to [5, 5, 5], giving [6, 7, 8].

Row + Column (advanced): np.array([[1, 2, 3]]) + np.array([[10], [20], [30]]) - the (1, 3) row broadcasts down and the (3, 1) column broadcasts right, producing the (3, 3) result [[11, 12, 13], [21, 22, 23], [31, 32, 33]].
Broadcasting Magic - 3 Rules
  1. Compare from Right - align shapes from trailing dimensions.
  2. Match or 1 - dimensions must be equal, or one is 1.
  3. Stretch the 1s - dimensions of 1 stretch to match the larger.

Broadcasting Examples

import numpy as np

# Scalar broadcasting
arr = np.array([1, 2, 3, 4, 5])
result = arr * 10
print(result)  # [10 20 30 40 50]

Scalar broadcasting: When you multiply an array by a single number (scalar), NumPy automatically "broadcasts" that number to every position in the array. It's as if the scalar 10 becomes [10, 10, 10, 10, 10] to match the array shape, then element-wise multiplication happens.

No memory overhead: NumPy doesn't actually create a full array of 10s in memory - it's smart enough to apply the operation efficiently. This makes scalar operations both fast and memory-efficient.
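If you're curious, you can observe this zero-copy behavior directly with np.broadcast_to, which returns a read-only view of the "stretched" array:

```python
import numpy as np

row = np.array([1, 2, 3])
virtual = np.broadcast_to(row, (4, 3))  # read-only view shaped (4, 3)
print(virtual)

# Stride 0 along axis 0: every "row" re-reads the same 3 values,
# so the repeats cost no extra memory
print(virtual.strides)
print(np.shares_memory(row, virtual))  # True
```

The zero stride is the whole trick: NumPy walks the same three numbers over and over instead of storing four copies of them.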

# Row vector broadcast to 2D
matrix = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 9]])
row = np.array([10, 20, 30])
result = matrix + row
print(result)
# [[11 22 33]
#  [14 25 36]
#  [17 28 39]]

Row vector broadcasting: The 1D array row has shape (3,) which aligns with the matrix's columns (3 columns). NumPy stretches the row vertically, repeating it for each row of the matrix. Row 0 gets [10, 20, 30] added, row 1 gets the same, and so on.

Alignment from right: NumPy aligns shapes from the rightmost dimension. Matrix shape (3, 3) and row shape (3,) - the trailing 3s match, so broadcasting works. If row had shape (2,), you'd get an error because 2 ≠ 3.
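Here is a quick sketch of what that incompatible-shape error looks like in practice:

```python
import numpy as np

matrix = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 9]])
bad_row = np.array([10, 20])  # shape (2,): trailing dims are 3 vs 2, neither is 1

try:
    matrix + bad_row
except ValueError as err:
    print("Broadcast failed:", err)
```

Reading the error message carefully ("operands could not be broadcast together with shapes ...") tells you exactly which two shapes clashed.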

# Column vector broadcast
col = np.array([[100], [200], [300]])
result = matrix + col
print(result)
# [[101 102 103]
#  [204 205 206]
#  [307 308 309]]

Column vector broadcasting: The array col has shape (3, 1) - 3 rows, 1 column. NumPy stretches it horizontally, repeating the single column across all columns of the matrix. Row 0 gets 100 added to all its elements, row 1 gets 200 added, etc.

Creating column vectors: Note the double brackets [[100], [200], [300]] which creates a 2D array with shape (3, 1). Alternatively, use np.array([100, 200, 300]).reshape(-1, 1) or arr[:, np.newaxis] to convert a 1D array to a column vector.

Shape Compatibility Table

Array A Shape | Array B Shape | Result Shape | Compatible?
(3, 4)        | (4,)          | (3, 4)       | Yes
(3, 4)        | (3, 1)        | (3, 4)       | Yes
(3, 4)        | (1, 4)        | (3, 4)       | Yes
(3, 4)        | ()            | (3, 4)       | Yes (scalar)
(3, 4)        | (3,)          | Error        | No
(2, 3, 4)     | (3, 4)        | (2, 3, 4)    | Yes
(2, 3, 4)     | (2, 1, 4)     | (2, 3, 4)    | Yes
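If your NumPy is 1.20 or newer, you can check the rows of this table programmatically with np.broadcast_shapes, which returns the result shape or raises ValueError for incompatible inputs:

```python
import numpy as np

# np.broadcast_shapes computes the broadcast result shape,
# or raises ValueError when the shapes are incompatible
print(np.broadcast_shapes((3, 4), (4,)))          # (3, 4)
print(np.broadcast_shapes((3, 4), (3, 1)))        # (3, 4)
print(np.broadcast_shapes((2, 3, 4), (2, 1, 4)))  # (2, 3, 4)

# The incompatible row from the table raises an error:
try:
    np.broadcast_shapes((3, 4), (3,))
except ValueError as err:
    print("Incompatible:", err)
```

This is a handy way to sanity-check shapes before running an expensive computation.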

Outer Products with Broadcasting

import numpy as np

# Create outer product using broadcasting
a = np.array([1, 2, 3, 4])
b = np.array([10, 20, 30])

# Reshape a to column vector and broadcast
outer = a.reshape(-1, 1) * b
print(outer)
# [[ 10  20  30]
#  [ 20  40  60]
#  [ 30  60  90]
#  [ 40  80 120]]

# Same result using np.outer
print(np.outer(a, b))

What is an outer product? The outer product of two vectors creates a matrix where each element [i, j] is the product of a[i] and b[j]. For vectors of length 4 and 3, you get a 4×3 matrix containing all possible pairwise products.

Broadcasting trick: a.reshape(-1, 1) converts the 1D array a with shape (4,) into a column vector with shape (4, 1). When multiplied by b with shape (3,), broadcasting creates a (4, 3) result - exactly the outer product!

Why -1 in reshape? The -1 tells NumPy to automatically calculate that dimension. reshape(-1, 1) means "make it a column vector with as many rows as needed". It's shorthand that works regardless of the original array length.

Alternative: np.outer(): NumPy provides np.outer(a, b) which does exactly this operation. The broadcasting method is useful to understand because the same pattern applies to many other computations like distance matrices and correlation tables.

Practical Broadcasting: Centering Data

import numpy as np

# Sample data: 5 observations, 3 features
data = np.array([[1.0, 2.0, 3.0],
                 [4.0, 5.0, 6.0],
                 [7.0, 8.0, 9.0],
                 [10.0, 11.0, 12.0],
                 [13.0, 14.0, 15.0]])

# Center data by subtracting column means
column_means = data.mean(axis=0)
print(f"Column means: {column_means}")  # [7. 8. 9.]

centered = data - column_means  # Broadcasting!
print("Centered data:")
print(centered)
# [[-6. -6. -6.]
#  [-3. -3. -3.]
#  [ 0.  0.  0.]
#  [ 3.  3.  3.]
#  [ 6.  6.  6.]]

# Verify: centered columns have mean 0
print(f"New column means: {centered.mean(axis=0)}")  # [0. 0. 0.]

What is centering? Centering data means shifting each feature (column) so its mean becomes zero. This is a critical preprocessing step in machine learning - many algorithms like PCA, SVM, and neural networks work better with centered data.

Step 1 - Calculate column means: data.mean(axis=0) computes the mean of each column separately, returning a 1D array with shape (3,) - one mean per feature. The axis=0 collapses rows to get column-wise statistics.

Step 2 - Broadcast subtraction: data - column_means subtracts the means from each row. The (5, 3) matrix minus (3,) vector broadcasts the means across all 5 rows automatically. Each column gets its own mean subtracted from every element in that column.

Without broadcasting: You'd need a loop like for i in range(data.shape[0]): data[i] -= column_means. Broadcasting replaces this with a single, readable line that's also 10-100x faster for large datasets.

Verification: After centering, centered.mean(axis=0) should return zeros (or very small numbers due to floating-point precision). This confirms each column now has mean zero.
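The same idea works per row. Here is a small sketch using axis=1 with keepdims=True, which keeps the means as a (2, 1) column so they broadcast correctly:

```python
import numpy as np

samples = np.array([[1.0, 2.0, 3.0],
                    [4.0, 5.0, 6.0]])

# axis=1 collapses columns, giving one mean per row.
# keepdims=True keeps shape (2, 1) so the means broadcast as a column
# against the (2, 3) data. Without keepdims the shape would be (2,),
# which would align with the 3 columns and raise a broadcast error.
row_means = samples.mean(axis=1, keepdims=True)
print(row_means)                   # [[2.], [5.]]

row_centered = samples - row_means
print(row_centered.mean(axis=1))   # [0. 0.]
```

keepdims=True is the standard way to keep a reduced axis around as size 1 so broadcasting lines up without manual reshaping.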

Practice: Broadcasting

Task: Add 100 to every element in a 3x3 array using broadcasting.

Show Solution
import numpy as np

arr = np.arange(9).reshape(3, 3)
result = arr + 100
print(result)
# [[100 101 102]
#  [103 104 105]
#  [106 107 108]]

Task: Multiply row 0 by 1, row 1 by 2, row 2 by 3 using a column vector.

Show Solution
import numpy as np

arr = np.ones((3, 4))
factors = np.array([[1], [2], [3]])  # Column vector
result = arr * factors
print(result)
# [[1. 1. 1. 1.]
#  [2. 2. 2. 2.]
#  [3. 3. 3. 3.]]

Task: Normalize each column of a matrix to the range 0-1 using broadcasting.

Show Solution
import numpy as np

data = np.array([[10, 200, 3000],
                 [20, 400, 1000],
                 [30, 100, 2000]])

col_min = data.min(axis=0)
col_max = data.max(axis=0)
normalized = (data - col_min) / (col_max - col_min)
print(normalized)
# [[0.  0.333 1.   ]
#  [0.5 1.    0.   ]
#  [1.  0.    0.5  ]]

Task: Create a matrix where element [i,j] = |a[i] - b[j]| using broadcasting.

Show Solution
import numpy as np

a = np.array([1, 2, 3, 4])
b = np.array([10, 20, 30])

# Reshape a to column, subtract b (row)
distance = np.abs(a.reshape(-1, 1) - b)
print(distance)
# [[ 9 19 29]
#  [ 8 18 28]
#  [ 7 17 27]
#  [ 6 16 26]]
07

Mathematical Functions

NumPy includes virtually every math function you might need - from basic trigonometry to logarithms to statistical functions. All of these are implemented in C for maximum speed. Rather than looping through an array and applying Python's math module to each element, NumPy functions work on entire arrays at once. This makes them not just faster but also more readable.

Remember: All math functions operate element-wise by default. np.sin(array) applies sine to every element, not to the array as a whole.

Trigonometric Functions

import numpy as np

angles = np.array([0, np.pi/6, np.pi/4, np.pi/3, np.pi/2])

print("Sine:", np.sin(angles).round(3))
# [0.    0.5   0.707 0.866 1.   ]

print("Cosine:", np.cos(angles).round(3))
# [1.    0.866 0.707 0.5   0.   ]

print("Tangent:", np.tan(angles[:4]).round(3))
# [0.    0.577 1.    1.732]

# Inverse trig functions
print("Arcsin(0.5):", np.arcsin(0.5))  # 0.524 (pi/6)

Radians, not degrees: NumPy's trig functions expect angles in radians. np.pi represents π (≈3.14159). Common angles: π/6 = 30°, π/4 = 45°, π/3 = 60°, π/2 = 90°. If you have degrees, convert with np.deg2rad(degrees).

Standard trig functions: np.sin(), np.cos(), np.tan() work element-wise on arrays. Pass an array of angles, get an array of results. The .round(3) just limits decimal places for display.

Inverse functions: np.arcsin(), np.arccos(), np.arctan() return angles (in radians) given a ratio. arcsin(0.5) returns π/6 because sin(π/6) = 0.5.

Also available: Hyperbolic functions (sinh, cosh, tanh) and their inverses (arcsinh, etc.) for advanced math applications.
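A quick sketch of the degree conversion mentioned above:

```python
import numpy as np

degrees = np.array([0, 30, 45, 60, 90])
radians = np.deg2rad(degrees)       # same as degrees * np.pi / 180
print(np.sin(radians).round(3))     # [0.    0.5   0.707 0.866 1.   ]

# And back again:
print(np.rad2deg(radians))          # [ 0. 30. 45. 60. 90.] (up to float precision)
```

Keeping all angle math in radians and converting only at the input/output boundary avoids most unit bugs.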

Exponential and Logarithmic

import numpy as np

arr = np.array([1, 2, 3, 4, 5])

# Exponential
print("e^x:", np.exp(arr).round(2))
# [  2.72   7.39  20.09  54.6  148.41]

# Logarithms
print("ln(x):", np.log(arr).round(3))
# [0.    0.693 1.099 1.386 1.609]

print("log10(x):", np.log10(arr).round(3))
# [0.    0.301 0.477 0.602 0.699]

print("log2(x):", np.log2(arr).round(3))
# [0.    1.    1.585 2.    2.322]

# Power functions
print("x^2:", np.power(arr, 2))  # [ 1  4  9 16 25]
print("sqrt(x):", np.sqrt(arr).round(3))
# [1.    1.414 1.732 2.    2.236]

Exponential (np.exp): Calculates e^x for each element, where e ≈ 2.71828. This is fundamental in statistics (normal distribution), finance (compound interest), and machine learning (softmax, sigmoid functions).

Natural logarithm (np.log): The inverse of exp(). np.log(x) returns the power to which you'd raise e to get x, so np.log(np.e) = 1. Warning: log of zero returns -inf and log of a negative number returns nan (with a runtime warning)!

Other logarithm bases: np.log10() is base-10 (common in decibels, pH). np.log2() is base-2 (common in computer science, information theory). For arbitrary base b: use np.log(x) / np.log(b).

Power functions: np.power(base, exp) computes base^exp element-wise. np.sqrt() is shorthand for power of 0.5. For cube roots, use np.cbrt() or np.power(x, 1/3).
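For example, here is the arbitrary-base formula in action (base 7 is an arbitrary choice for illustration):

```python
import numpy as np

# Logarithm in an arbitrary base b: log_b(x) = ln(x) / ln(b)
x = np.array([1, 7, 49, 343])
log_base7 = np.log(x) / np.log(7)
print(log_base7.round(3))  # [0. 1. 2. 3.]
```

The result confirms that 7^0 = 1, 7^1 = 7, 7^2 = 49, and 7^3 = 343.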

Rounding Functions

import numpy as np

arr = np.array([1.234, 2.567, 3.891, -1.234, -2.567])

print("Round:", np.round(arr, 1))
# [ 1.2  2.6  3.9 -1.2 -2.6]

print("Floor:", np.floor(arr))
# [ 1.  2.  3. -2. -3.]

print("Ceil:", np.ceil(arr))
# [ 2.  3.  4. -1. -2.]

print("Truncate:", np.trunc(arr))
# [ 1.  2.  3. -1. -2.]

# Absolute value
print("Abs:", np.abs(arr).round(3))
# [1.234 2.567 3.891 1.234 2.567]

np.round(arr, decimals): Rounds to the specified number of decimal places. np.round(3.567, 1) gives 3.6. Note: NumPy uses "banker's rounding" - values exactly halfway between two candidates round to the nearest even number, so 2.5 rounds to 2, not 3.

np.floor(): Always rounds DOWN toward negative infinity. floor(2.9) = 2, but floor(-2.1) = -3 (not -2!). Think of it as "the largest integer less than or equal to x".

np.ceil(): Always rounds UP toward positive infinity. ceil(2.1) = 3, and ceil(-2.9) = -2. Think of it as "the smallest integer greater than or equal to x".

np.trunc(): Removes the decimal part, rounding toward zero. trunc(2.9) = 2 and trunc(-2.9) = -2. Unlike floor, negative numbers round up (toward zero). np.abs() returns absolute values.

Statistical Functions

import numpy as np

data = np.array([23, 45, 67, 89, 12, 34, 56, 78, 90, 11])

print(f"Mean: {np.mean(data)}")           # 50.5
print(f"Median: {np.median(data)}")       # 50.5
print(f"Std Dev: {np.std(data):.2f}")     # 28.64
print(f"Variance: {np.var(data):.2f}")    # 820.25
print(f"Min: {np.min(data)}")             # 11
print(f"Max: {np.max(data)}")             # 90
print(f"Sum: {np.sum(data)}")             # 505
print(f"Product: {np.prod(data[:5])}")    # 23*45*67*89*12

# Percentiles
print(f"25th percentile: {np.percentile(data, 25)}")
print(f"75th percentile: {np.percentile(data, 75)}")

# Cumulative operations
print("Cumsum:", np.cumsum(data[:5]))
# [ 23  68 135 224 236]

Central tendency: np.mean() calculates the arithmetic average. np.median() finds the middle value when sorted - more robust to outliers than mean. For weighted averages, use np.average(data, weights=w).

Spread measures: np.std() calculates standard deviation (average distance from mean). np.var() is variance (std squared). By default, these use N in the denominator; for sample statistics, use ddof=1.

Percentiles: np.percentile(data, 25) finds the value below which 25% of data falls. Median is the 50th percentile. The interquartile range (IQR) is percentile(data, 75) - percentile(data, 25).

Cumulative operations: np.cumsum() returns running totals - each element is the sum of all previous elements plus itself. np.cumprod() does the same for products. Great for calculating cumulative distributions or running balances.
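As a small illustration of cumulative products, here is a running-balance sketch using made-up daily returns:

```python
import numpy as np

# Hypothetical daily returns: +1%, -2%, +3%
returns = np.array([0.01, -0.02, 0.03])

# cumprod of the growth factors gives the compound growth after each day
growth = np.cumprod(1 + returns)
print(growth.round(4))           # [1.01   0.9898 1.0195]

# Value of a $100 starting balance after each day:
print((100 * growth).round(2))   # [101.   98.98 101.95]
```

The same cumprod pattern applies to any compounding process: interest, population growth, or chained probabilities.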

Comparison and Logic Functions

import numpy as np

a = np.array([1, 2, 3, 4, 5])
b = np.array([5, 4, 3, 2, 1])

# Element-wise comparison
print("a > b:", a > b)   # [False False False  True  True]
print("a == b:", a == b) # [False False  True False False]

# Aggregated comparisons
print("All a > 0:", np.all(a > 0))  # True
print("Any a > 4:", np.any(a > 4))  # True

# Where function (conditional selection)
result = np.where(a > b, a, b)  # Pick larger element
print("Max of each:", result)  # [5 4 3 4 5]

# Clip values to range
arr = np.array([1, 5, 10, 15, 20])
clipped = np.clip(arr, 5, 15)
print("Clipped:", clipped)  # [ 5  5 10 15 15]

Element-wise comparison: Operators like >, <, ==, != compare arrays element-by-element, returning a boolean array. This is the foundation for boolean indexing and filtering.

Aggregated checks: np.all(condition) returns True only if ALL elements satisfy the condition. np.any(condition) returns True if ANY element satisfies it. Useful for validation: np.all(arr > 0) checks if all values are positive.

np.where(): A powerful conditional selector. np.where(condition, x, y) returns x where condition is True, y where False. If x and y are arrays, it picks element-by-element. With just condition, it returns indices where True.

np.clip(): Restricts values to a range. np.clip(arr, min, max) replaces values below min with min, above max with max. Perfect for bounding data or preventing overflow in calculations.

Set Operations

import numpy as np

a = np.array([1, 2, 3, 4, 5])
b = np.array([3, 4, 5, 6, 7])

# Union (all unique elements from both)
union = np.union1d(a, b)
print("Union:", union)  # [1 2 3 4 5 6 7]

np.union1d(a, b): Returns all unique elements that appear in either array a OR array b (or both). Like the mathematical set union (A ∪ B). The result is always sorted and contains no duplicates.

The "1d" suffix: All set operations have "1d" in their name because they work on 1D arrays. If you pass a 2D array, it gets flattened first. This is different from most NumPy functions that preserve dimensions.

# Intersection (common elements)
intersection = np.intersect1d(a, b)
print("Intersection:", intersection)  # [3 4 5]

np.intersect1d(a, b): Returns elements that appear in BOTH arrays (A ∩ B). Here, 3, 4, and 5 are in both a and b. The result is sorted and unique.

Real-world use: Finding common customers between two stores, overlapping skills between job candidates, or shared features between datasets. Intersection helps identify what two groups have in common.

# Difference (in a but not in b)
diff = np.setdiff1d(a, b)
print("Difference (a-b):", diff)  # [1 2]

# Symmetric difference (in either, but not both)
sym_diff = np.setxor1d(a, b)
print("Symmetric diff:", sym_diff)  # [1 2 6 7]

np.setdiff1d(a, b): Returns elements in a that are NOT in b (A - B). Order matters! setdiff1d(a, b) gives [1, 2] but setdiff1d(b, a) would give [6, 7]. Think of it as "what's exclusive to the first array?"

np.setxor1d(a, b): Symmetric difference - elements in either array but NOT in both (A △ B). It's like union minus intersection. Here we get [1, 2] from a-only and [6, 7] from b-only, but not 3, 4, 5 which appear in both.

# Check membership
print("Is in:", np.isin([1, 2, 6], a))  # [ True  True False]

np.isin(elements, test_array): Checks each element of the first argument against the second array. Returns a boolean array of the same shape as the first argument. Here, 1 is in a (True), 2 is in a (True), 6 is not in a (False).

Filtering use case: Combine with boolean indexing: data[np.isin(data['category'], allowed_categories)] keeps only rows with valid categories. Much faster than Python loops for large datasets.
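A minimal sketch of that filtering pattern, using made-up sensor IDs and plain arrays rather than a structured array:

```python
import numpy as np

# Made-up data: keep only readings from approved sensors
sensor_ids = np.array([101, 205, 101, 303, 205, 999])
readings = np.array([0.5, 1.2, 0.7, 3.3, 1.1, 9.9])
approved = np.array([101, 205])

mask = np.isin(sensor_ids, approved)
print(mask)            # [ True  True  True False  True False]
print(readings[mask])  # [0.5 1.2 0.7 1.1]
```

The boolean mask from np.isin plugs straight into indexing, so the filter stays a single vectorized expression.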

Special Values

import numpy as np

# Infinity
print(np.inf)        # inf
print(np.inf > 1e308) # True
print(-np.inf)       # -inf
print(1 / np.inf)    # 0.0

np.inf (Infinity): Represents positive infinity - a value larger than any finite number. np.inf > 1e308 is True because infinity beats even the largest float. -np.inf is negative infinity (smaller than any finite number).

When infinity appears: Division by zero with floats gives infinity (1.0 / 0.0 = inf). Overflow in calculations can produce infinity. It's also useful as an initial value: min_val = np.inf ensures any real number will be smaller.

# Not a Number (NaN)
arr = np.array([1.0, np.nan, 3.0])
print(np.isnan(arr))  # [False  True False]

# Critical: NaN comparisons are always False!
print(np.nan == np.nan)  # False (!)
print(np.nan != np.nan)  # True (!)

np.nan (Not a Number): Represents missing, undefined, or invalid values. Common sources: 0/0, inf - inf, or reading missing data from files. NaN is a float value, so integer arrays can't contain NaN (use masked arrays instead).

The NaN trap: np.nan == np.nan is FALSE by IEEE standard! This catches many beginners. Never use == np.nan to check for NaN. Always use np.isnan(value) which correctly identifies NaN values.

# NaN-safe statistics
print(np.nanmean(arr))  # 2.0 (ignores NaN)
print(np.nansum(arr))   # 4.0
print(np.nanstd(arr))   # 1.0

# Regular functions propagate NaN
print(np.mean(arr))     # nan (contaminated!)

NaN propagation: Regular NumPy functions "propagate" NaN - if any element is NaN, the result is NaN. np.mean([1, nan, 3]) returns nan, not 2. This can silently corrupt your calculations!

NaN-safe functions: Functions prefixed with "nan" ignore NaN values: nanmean, nansum, nanstd, nanvar, nanmin, nanmax, nanargmin, nanargmax. They compute statistics using only the valid (non-NaN) values.

# Replace NaN with a value
arr = np.array([1.0, np.nan, 3.0])
arr[np.isnan(arr)] = 0
print(arr)  # [1. 0. 3.]

# Or use np.nan_to_num
arr2 = np.array([1.0, np.nan, np.inf, -np.inf])
clean = np.nan_to_num(arr2, nan=0.0, posinf=999, neginf=-999)
print(clean)  # [  1.   0. 999. -999.]

Replacing NaN: Use boolean indexing with np.isnan() to find and replace NaN values. arr[np.isnan(arr)] = 0 replaces all NaN with 0. Choose a replacement that makes sense for your data (0, mean, median, etc.).

np.nan_to_num(): A convenient function that replaces NaN and infinity in one call. You can specify custom replacement values for nan, positive infinity (posinf), and negative infinity (neginf). Great for cleaning data before further processing.

# Check finite values
arr = np.array([1, np.inf, np.nan, -np.inf])
print(np.isfinite(arr))  # [ True False False False]
print(np.isinf(arr))     # [False  True False  True]
print(np.isnan(arr))     # [False False  True False]

Validation functions: np.isfinite() returns True only for "normal" numbers - not NaN and not infinity. np.isinf() detects both positive and negative infinity. np.isnan() detects NaN values.

Data cleaning pattern: Before calculations, validate your data: if not np.all(np.isfinite(data)): handle_bad_data(). This catches NaN and infinity early, preventing mysterious bugs in downstream calculations.
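One way that validation pattern might look in practice (check_data is a hypothetical helper name, not a NumPy function):

```python
import numpy as np

def check_data(data):
    """Hypothetical helper: report and replace non-finite values."""
    bad = ~np.isfinite(data)       # True for NaN, inf, and -inf
    if bad.any():
        print(f"Found {bad.sum()} non-finite value(s) - replacing with 0")
        data = np.where(bad, 0.0, data)
    return data

raw = np.array([1.0, np.nan, 3.0, np.inf])
clean = check_data(raw)
print(clean)  # [1. 0. 3. 0.]
```

Catching bad values at the boundary like this is far easier than debugging a nan that surfaces ten steps later.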

Function Reference

Category | Functions              | Description
Trig     | sin, cos, tan          | Trigonometric functions
Trig     | arcsin, arccos, arctan | Inverse trig functions
Trig     | sinh, cosh, tanh       | Hyperbolic functions
Trig     | deg2rad, rad2deg       | Angle conversion
Exp/Log  | exp, exp2, expm1       | Exponential functions
Exp/Log  | log, log10, log2       | Logarithms
Exp/Log  | power, sqrt            | Powers and roots
Exp/Log  | cbrt                   | Cube root
Round    | round, around          | Round to decimals
Round    | floor, ceil            | Round down/up
Round    | trunc, fix             | Truncate to integer
Round    | rint                   | Round to nearest int
Stats    | mean, median           | Central tendency
Stats    | std, var               | Spread measures
Stats    | min, max, ptp          | Range (ptp = peak-to-peak)
Stats    | percentile, quantile   | Distribution percentiles

Practice: Mathematical Functions

Task: Create 100 points of a sine wave from 0 to 2*pi.

Show Solution
import numpy as np

x = np.linspace(0, 2*np.pi, 100)
y = np.sin(x)
print(f"First 5 y values: {y[:5].round(3)}")
# [0.    0.063 0.127 0.189 0.251]

Task: Normalize data to z-scores: (x - mean) / std

Show Solution
import numpy as np

data = np.array([10, 20, 30, 40, 50])
z_scores = (data - np.mean(data)) / np.std(data)
print(f"Z-scores: {z_scores.round(3)}")
# [-1.414 -0.707  0.     0.707  1.414]

# Verify: z-scores have mean=0, std=1
print(f"Mean: {z_scores.mean():.10f}")  # ~0
print(f"Std: {z_scores.std():.10f}")    # ~1

Task: Implement softmax: exp(x) / sum(exp(x))

Show Solution
import numpy as np

def softmax(x):
    exp_x = np.exp(x - np.max(x))  # Subtract max for numerical stability
    return exp_x / exp_x.sum()

logits = np.array([2.0, 1.0, 0.1])
probs = softmax(logits)
print(f"Probabilities: {probs.round(3)}")
# [0.659 0.242 0.099]
print(f"Sum: {probs.sum()}")  # 1.0

Task: Calculate 3-day moving average of stock prices.

Show Solution
import numpy as np

prices = np.array([100, 102, 101, 105, 110, 108, 112])

# Using cumsum for efficient moving average
def moving_average(arr, window):
    cumsum = np.cumsum(arr)
    cumsum[window:] = cumsum[window:] - cumsum[:-window]
    return cumsum[window - 1:] / window

ma3 = moving_average(prices, 3)
print(f"3-day MA: {ma3.round(2)}")
# [101.   102.67 105.33 107.67 110.  ]
08

Array Manipulation

Often you need to reorganize your data into a different shape. For example, you might receive data as a flat 1D array but need it as a 2D matrix for calculations. Or you might need to combine multiple arrays, split one array into pieces, or add new dimensions. NumPy makes these operations fast and memory-efficient, allowing you to focus on your problem rather than data management.

Key Advantage: Reshape and similar operations create 'views' of data when possible, not copies. This means they're nearly free in terms of memory!

Reshaping Arrays

import numpy as np

arr = np.arange(12)
print("Original:", arr)  # [ 0  1  2  3  4  5  6  7  8  9 10 11]

# Reshape to 2D
reshaped = arr.reshape(3, 4)
print("3x4:\n", reshaped)

The reshape method transforms a 1D array into any compatible multi-dimensional shape. When you call reshape(3, 4), you're asking NumPy to arrange 12 elements into 3 rows and 4 columns. The total number of elements must remain the same—12 elements can become 3×4, 4×3, 2×6, 6×2, or any other combination that multiplies to 12.

Reshaping creates a "view" of the original data when possible, meaning both arrays share the same memory. This is extremely efficient because no data is copied. However, it also means changes to one array may affect the other. Understanding this behavior is crucial when working with large datasets where memory efficiency matters.

# Reshape with -1 (auto-calculate dimension)
auto = arr.reshape(2, -1)  # 2 rows, columns auto-calculated
print("2x?:", auto.shape)  # (2, 6)

The -1 placeholder is a powerful shortcut that tells NumPy to automatically calculate the missing dimension based on the array's total size. When you specify reshape(2, -1) for a 12-element array, NumPy figures out that -1 must be 6 (since 2 × 6 = 12). This is especially useful when processing batches of data where one dimension might vary.

You can only use one -1 in a reshape call—NumPy can solve for one unknown dimension, but not multiple. This feature is commonly used in machine learning pipelines where you might know you want a certain batch size but want the other dimensions calculated automatically based on the input data shape.

# Flatten back to 1D
flat = reshaped.flatten()  # Returns copy
ravel = reshaped.ravel()   # Returns view when possible

When you need to convert a multi-dimensional array back to 1D, NumPy offers two methods with different behaviors. flatten() always creates a completely new copy of the data, guaranteeing independence from the original array. This is safer when you need to modify the flattened array without affecting the original.

ravel() is more memory-efficient because it returns a view when the array's memory layout allows it. A view shares memory with the original, so changes propagate between them. Use ravel() for read-only operations or when memory is constrained, and flatten() when you need guaranteed independence between the arrays.
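A quick experiment makes the copy-versus-view difference concrete:

```python
import numpy as np

matrix = np.arange(6).reshape(2, 3)

flat = matrix.flatten()  # always an independent copy
view = matrix.ravel()    # a view here (the memory layout is contiguous)

matrix[0, 0] = 99
print(flat[0])  # 0  - the copy is unaffected
print(view[0])  # 99 - the view sees the change
```

If you ever need certainty, np.shares_memory(matrix, view) tells you whether two arrays overlap in memory.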

# Add dimension with newaxis
arr1d = np.array([1, 2, 3])
row = arr1d[np.newaxis, :]  # Shape: (1, 3)
col = arr1d[:, np.newaxis]  # Shape: (3, 1)
print(f"Row shape: {row.shape}, Col shape: {col.shape}")

np.newaxis is a special index that inserts a new dimension of size 1 at the specified position. When you write arr[np.newaxis, :], you're adding a dimension at the front, converting a 1D array of shape (3,) into a 2D row vector of shape (1, 3). Conversely, arr[:, np.newaxis] adds a dimension at the end, creating a column vector of shape (3, 1).

This technique is essential for broadcasting operations. When you need to perform calculations between arrays of different dimensions, adding dimensions with newaxis allows NumPy to properly align and broadcast the arrays. It's commonly used when you need to treat a 1D array as either a row or column vector for matrix operations.

Transposing Arrays

import numpy as np

arr = np.array([[1, 2, 3],
                [4, 5, 6]])
print("Original shape:", arr.shape)  # (2, 3)

Transposing is a fundamental operation that swaps the rows and columns of a matrix. Our original array has 2 rows and 3 columns, giving it shape (2, 3). This is a common arrangement for data where each row might represent a sample and each column represents a feature. Understanding shape is crucial before transposing.

In NumPy, shape is always represented as a tuple showing the size of each dimension. For 2D arrays, the first number is rows and the second is columns. When working with higher-dimensional arrays (3D, 4D, etc.), each position in the shape tuple represents the size along that axis.

# Transpose
transposed = arr.T
print("Transposed shape:", transposed.shape)  # (3, 2)
print(transposed)
# [[1 4]
#  [2 5]
#  [3 6]]

The .T attribute is the simplest way to transpose a 2D array. What was previously at position [i, j] is now at position [j, i]. Our 2×3 matrix becomes a 3×2 matrix—the first row [1, 2, 3] becomes the first column [1, 4], and similarly for other elements. This operation is essential for matrix multiplication and many machine learning algorithms.

Like reshape, transpose returns a view of the original data, not a copy. This means it's very fast and memory-efficient, but modifications to the transposed array will affect the original. Transpose is commonly used when you need to switch between row-major and column-major data formats or when preparing data for specific mathematical operations.
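You can verify the view behavior yourself:

```python
import numpy as np

arr = np.array([[1, 2, 3],
                [4, 5, 6]])
t = arr.T                        # a view, not a copy

t[0, 1] = 99                     # position [0, 1] of t is [1, 0] of arr
print(arr[1, 0])                 # 99 - the original changed too
print(np.shares_memory(arr, t))  # True
```

When you need an independent transposed array, make it explicit with arr.T.copy().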

# For higher dimensions, use transpose with axis order
arr3d = np.arange(24).reshape(2, 3, 4)
print("3D shape:", arr3d.shape)  # (2, 3, 4)
reordered = arr3d.transpose(1, 0, 2)
print("Reordered shape:", reordered.shape)  # (3, 2, 4)

For arrays with more than 2 dimensions, .T simply reverses all axes, which isn't always what you want. The transpose() method allows you to specify exactly how axes should be reordered. The argument (1, 0, 2) means: put axis 1 first, axis 0 second, and axis 2 last. This transforms shape (2, 3, 4) into (3, 2, 4).

This precise control is essential in deep learning when working with batches of images or sequences. Different frameworks expect different axis orderings—some want (batch, channels, height, width) while others want (batch, height, width, channels). The transpose method lets you convert between these formats efficiently without copying data.
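A minimal sketch of that layout conversion, with made-up batch dimensions:

```python
import numpy as np

# Made-up batch: 2 RGB images of 4x5 pixels in NHWC layout
# (batch, height, width, channels)
nhwc = np.zeros((2, 4, 5, 3))

# Reorder to channels-first NCHW (batch, channels, height, width):
# put axis 0 first, axis 3 second, then axes 1 and 2
nchw = nhwc.transpose(0, 3, 1, 2)
print(nchw.shape)  # (2, 3, 4, 5)
```

Because transpose returns a view, this conversion costs nothing until a downstream operation actually needs contiguous memory.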

Stacking and Concatenating

import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])

Combining arrays is a common operation when you need to merge data from different sources or build larger datasets from smaller pieces. Here we create two 2×2 matrices that we'll combine in different ways. Array 'a' contains values 1-4 and array 'b' contains values 5-8.

Before combining arrays, they must have compatible shapes along the non-joining axes. For vertical stacking, columns must match; for horizontal stacking, rows must match. Understanding these requirements prevents common errors when working with real-world data of varying shapes.

# Vertical stack (add rows)
vstacked = np.vstack([a, b])
print("vstack:\n", vstacked)
# [[1 2]
#  [3 4]
#  [5 6]
#  [7 8]]

vstack (vertical stack) combines arrays by placing them on top of each other, essentially adding more rows. Our two 2×2 arrays become one 4×2 array. Think of it like stacking papers on a desk—each array becomes a new section of rows in the combined result.

This operation is equivalent to concatenating along axis 0 (the row axis). It's commonly used when you have data in separate batches and need to combine them into a single dataset, such as when loading training data from multiple files or accumulating results from iterative processing.

# Horizontal stack (add columns)
hstacked = np.hstack([a, b])
print("hstack:\n", hstacked)
# [[1 2 5 6]
#  [3 4 7 8]]

hstack (horizontal stack) combines arrays by placing them side by side, adding more columns. Our two 2×2 arrays become one 2×4 array. The first row of the result contains the first rows of both input arrays joined together, and similarly for subsequent rows.

This operation is equivalent to concatenating along axis 1 (the column axis). It's useful when you have different features stored in separate arrays and need to combine them into a single feature matrix, such as when merging datasets that share the same samples but have different measurements.

# Concatenate with axis parameter
concat_v = np.concatenate([a, b], axis=0)  # Same as vstack
concat_h = np.concatenate([a, b], axis=1)  # Same as hstack

# Depth stack (3D)
dstacked = np.dstack([a, b])
print("dstack shape:", dstacked.shape)  # (2, 2, 2)

np.concatenate is the general-purpose function that works along any axis you specify. With axis=0, it behaves like vstack; with axis=1, like hstack. This explicit control is clearer when working with higher-dimensional arrays where "vertical" and "horizontal" become ambiguous.

dstack (depth stack) adds a third dimension, stacking arrays "behind" each other. Two 2×2 arrays become a 2×2×2 3D array. This is useful in image processing where you might stack color channels, or in time series analysis where you stack multiple timesteps together.
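Here is a small sketch of the color-channel use case, with tiny made-up 2x2 channels:

```python
import numpy as np

# Tiny made-up 2x2 single-channel images for red, green, blue
r = np.array([[255, 0], [0, 0]])
g = np.array([[0, 255], [0, 0]])
b = np.array([[0, 0], [255, 0]])

image = np.dstack([r, g, b])  # shape (2, 2, 3): height x width x channels
print(image.shape)            # (2, 2, 3)
print(image[0, 0])            # [255   0   0] - pixel (0, 0) across channels
```

Indexing image[row, col] now returns the full color triple for that pixel, which is exactly the layout most image libraries expect.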

Splitting Arrays

import numpy as np

arr = np.arange(16).reshape(4, 4)
print("Original:\n", arr)

Splitting is the inverse of stacking—it divides one array into multiple smaller arrays. Here we create a 4×4 array containing values 0 through 15, which gives us a nice symmetric shape that can be split evenly in multiple ways.

Splitting is commonly used when you need to divide data into training and testing sets, process large arrays in chunks that fit in memory, or extract specific portions of a dataset for separate analysis. NumPy provides several splitting functions optimized for different use cases.

# Split into equal parts
top, bottom = np.vsplit(arr, 2)
print("Top half:\n", top)
# [[0 1 2 3]
#  [4 5 6 7]]

vsplit (vertical split) divides an array into multiple sub-arrays along the row axis. When you pass the number 2, it splits into 2 equal parts—the top half and bottom half. The 4-row array becomes two 2-row arrays. The number of rows must be evenly divisible by the split count.

This function returns a list of arrays that you can unpack directly into variables. It's useful for separating header rows from data rows, dividing images into top and bottom sections, or creating train/test splits where you want an exact split ratio.

left, right = np.hsplit(arr, 2)
print("Left half:\n", left)
# [[ 0  1]
#  [ 4  5]
#  [ 8  9]
#  [12 13]]

hsplit (horizontal split) divides an array along the column axis. Splitting our 4×4 array into 2 parts gives us two 4×2 arrays—the left columns and the right columns. Each row is effectively cut in half, with the first two columns going to 'left' and the last two to 'right'.

This is useful when different columns represent different types of data that need separate processing. For example, you might split a matrix to separate features from labels, or divide a wide dataset into chunks for parallel processing.

# Split at specific indices
a, b, c = np.split(arr, [1, 3], axis=0)
# a: row 0, b: rows 1-2, c: row 3

The general split() function allows you to specify exactly where to cut the array using index positions. The indices [1, 3] create cuts before row 1 and before row 3, resulting in three pieces: rows 0 (before index 1), rows 1-2 (between indices 1 and 3), and row 3 (from index 3 to the end).

This flexibility is essential when you need unequal splits. Unlike vsplit and hsplit which require equal divisions, split() lets you extract specific portions of your data. The axis parameter controls whether you're splitting rows (axis=0) or columns (axis=1), making it versatile for any splitting scenario.
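The same index-based splitting works column-wise with axis=1. A quick sketch, reconstructing the 4×4 array used above:

```python
import numpy as np

# Reconstruct the 4x4 array from the examples above
arr = np.arange(16).reshape(4, 4)

# Cuts before column 1 and before column 3 give three unequal pieces
a, b, c = np.split(arr, [1, 3], axis=1)
print(a.shape, b.shape, c.shape)  # (4, 1) (4, 2) (4, 1)
```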

Repeating and Tiling

import numpy as np

arr = np.array([1, 2, 3])

# Repeat each element
repeated = np.repeat(arr, 3)
print("Repeat:", repeated)
# [1 1 1 2 2 2 3 3 3]

np.repeat() duplicates each individual element a specified number of times. When you repeat [1, 2, 3] by 3, each element appears three times consecutively: 1 appears three times, then 2 appears three times, then 3 appears three times. The result is a longer array where elements are grouped together.

This is useful for creating expanded datasets, upsampling data, or preparing arrays for broadcasting operations. For example, if you have category labels and need to duplicate them to match the length of a detailed dataset, repeat() handles this efficiently.
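repeat() also accepts a per-element count, which covers the label-duplication case just described. A minimal sketch with made-up group sizes:

```python
import numpy as np

labels = np.array([0, 1, 2])
counts = [2, 1, 3]  # hypothetical: group 0 has 2 rows, group 1 has 1, group 2 has 3
expanded = np.repeat(labels, counts)
print(expanded)  # [0 0 1 2 2 2]
```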

# Tile the entire array
tiled = np.tile(arr, 3)
print("Tile:", tiled)
# [1 2 3 1 2 3 1 2 3]

np.tile() duplicates the entire array as a unit, like laying floor tiles. When you tile [1, 2, 3] by 3, the complete array is repeated three times in sequence: [1, 2, 3] then [1, 2, 3] then [1, 2, 3]. The pattern of the original array is preserved and repeated.

The key difference from repeat is what gets duplicated: repeat duplicates individual elements, while tile duplicates the whole array structure. Tile is commonly used to create periodic patterns, replicate templates across larger spaces, or build matrices with repeating block structures.

# 2D tiling
arr2d = np.array([[1, 2], [3, 4]])
tiled2d = np.tile(arr2d, (2, 3))  # 2 vertical, 3 horizontal
print("Tiled 2D shape:", tiled2d.shape)  # (4, 6)

For multi-dimensional arrays, tile() accepts a tuple specifying how many times to repeat along each axis. The tuple (2, 3) means: repeat 2 times vertically (along rows) and 3 times horizontally (along columns). A 2×2 array tiled (2, 3) becomes a 4×6 array—2×2 = 4 rows, 2×3 = 6 columns.

This 2D tiling creates a grid of copies in which your original array appears 6 times (2 rows × 3 columns of tiles). It's extremely useful in image processing for creating texture patterns, in numerical simulations for building periodic boundary conditions, or in any application requiring regular repeating structures.
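As a concrete illustration of the repeating-block idea, tiling a 2×2 block produces a checkerboard (a constructed example, not from the text above):

```python
import numpy as np

block = np.array([[0, 1],
                  [1, 0]])
board = np.tile(block, (4, 4))  # 4 tiles down, 4 tiles across -> 8x8
print(board.shape)        # (8, 8)
print(board[0].tolist())  # [0, 1, 0, 1, 0, 1, 0, 1]
```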

Inserting and Deleting

import numpy as np

arr = np.array([1, 2, 3, 4, 5])

# Insert values
new_arr = np.insert(arr, 2, [10, 11])  # Insert at index 2
print("After insert:", new_arr)  # [ 1  2 10 11  3  4  5]

np.insert() adds new values at a specified position in the array. The three arguments are: the original array, the index position where insertion starts, and the values to insert. When you insert [10, 11] at index 2, the existing elements from index 2 onward shift to the right to make room.

Importantly, insert() returns a new array—it doesn't modify the original. This is because NumPy arrays have fixed sizes after creation. Creating a new array ensures you always know exactly what you're working with, avoiding unexpected mutations that could cause bugs in complex code.

# Delete values
new_arr = np.delete(arr, [1, 3])  # Delete indices 1 and 3
print("After delete:", new_arr)  # [1 3 5]

np.delete() removes elements at specified indices. You can pass a single index or a list of indices to remove multiple elements at once. Deleting indices 1 and 3 from [1, 2, 3, 4, 5] removes the second element (2) and fourth element (4), leaving [1, 3, 5].

Like insert, delete returns a new array rather than modifying in place. This is by design—NumPy's contiguous memory layout means you can't simply remove elements without reallocating the entire array. When deleting many elements, it's more efficient to use boolean masking, which we covered in the indexing section.
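The boolean-mask equivalent of the delete above might look like this (one common pattern, not the only one):

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])

# Build a mask that is False at the indices to drop
mask = np.ones(arr.size, dtype=bool)
mask[[1, 3]] = False
print(arr[mask])  # [1 3 5]
```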

# Append values
new_arr = np.append(arr, [6, 7, 8])
print("After append:", new_arr)  # [1 2 3 4 5 6 7 8]

np.append() adds values to the end of an array. Despite the name, it behaves differently from a Python list's append: it concatenates the new values to the original array and returns a completely new array. The original remains unchanged, and the result contains all original elements followed by the appended values.

A critical performance note: repeatedly appending to arrays in a loop is very slow because each append creates a new array. If you need to build an array incrementally, use a Python list for accumulation, then convert to NumPy once at the end. Or better, preallocate the full array if you know the final size.
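A sketch of the list-accumulation pattern, assuming values arrive one at a time in a loop:

```python
import numpy as np

# Accumulate in a Python list (cheap appends), convert to NumPy once
chunks = []
for i in range(5):
    chunks.append(i ** 2)  # stand-in for a value computed each iteration
result = np.array(chunks)
print(result)  # [ 0  1  4  9 16]
```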

# 2D array operations
matrix = np.arange(6).reshape(2, 3)
new_row = np.array([[10, 11, 12]])
result = np.vstack([matrix, new_row])
print(result)

For 2D arrays, adding rows or columns is typically done with stacking functions rather than insert. Here we add a new row to a matrix using vstack. The new_row is shaped as (1, 3)—a 2D array with one row—to match the structure of the original matrix. This ensures proper alignment when stacking.

The general principle for array modification is: use insert/delete for 1D arrays when you need precise index control, and use stacking functions (vstack, hstack, concatenate) for 2D and higher arrays. These functions are more intuitive for multi-dimensional data and clearly express your intent to add rows or columns.
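The same stacking idea adds a column: shape the new data as (n_rows, 1) and use hstack. A quick sketch reusing the matrix from above:

```python
import numpy as np

matrix = np.arange(6).reshape(2, 3)
new_col = np.array([[10],
                    [11]])  # shape (2, 1) to match the 2 rows

with_col = np.hstack([matrix, new_col])
print(with_col.shape)  # (2, 4)
print(with_col)
```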

Practice: Array Manipulation

Task: Convert np.arange(20) into a 4x5 array.

Show Solution
import numpy as np

arr = np.arange(20).reshape(4, 5)
print(arr)
# [[ 0  1  2  3  4]
#  [ 5  6  7  8  9]
#  [10 11 12 13 14]
#  [15 16 17 18 19]]

Task: Create two 2x3 arrays and combine them both ways.

Show Solution
import numpy as np

a = np.ones((2, 3))
b = np.zeros((2, 3))

vertical = np.vstack([a, b])
print("Vertical (4x3):\n", vertical)

horizontal = np.hstack([a, b])
print("Horizontal (2x6):\n", horizontal)

Task: Split a 6x6 array into four 3x3 quadrants, then swap top-left with bottom-right.

Show Solution
import numpy as np

arr = np.arange(36).reshape(6, 6)

# Split into quadrants
top, bottom = np.vsplit(arr, 2)
tl, tr = np.hsplit(top, 2)
bl, br = np.hsplit(bottom, 2)

# Swap tl and br
new_top = np.hstack([br, tr])
new_bottom = np.hstack([bl, tl])
result = np.vstack([new_top, new_bottom])
print(result)
09

Linear Algebra

Linear algebra is the mathematics behind most machine learning algorithms. NumPy's linear algebra module (np.linalg) provides tools for matrix operations, solving equation systems, computing eigenvectors, and more. You don't need to be a math expert to use these tools - think of them as pre-built solutions to common problems: solving equations, computing eigenvalues, and working with matrices at scale.

Why it matters: Every neural network, regression model, and dimensionality reduction algorithm relies on linear algebra. Understanding the basics will help you understand ML algorithms better.

Matrix Multiplication

import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])

# Element-wise multiplication (NOT matrix multiplication)
elementwise = A * B
print("Element-wise:\n", elementwise)
# [[ 5 12]
#  [21 32]]

When you use the * operator between two NumPy arrays, you get element-wise multiplication—each element at position [i, j] in the first array is multiplied by the element at the same position in the second array. So A[0,0] × B[0,0] = 1 × 5 = 5, A[0,1] × B[0,1] = 2 × 6 = 12, and so on.

This is fundamentally different from mathematical matrix multiplication. Element-wise multiplication requires arrays of the same shape (or broadcastable shapes), while matrix multiplication follows different rules. Many beginners confuse these operations, so always be explicit about which one you need.

# Matrix multiplication (3 equivalent ways)
matmul1 = A @ B           # Recommended
matmul2 = np.matmul(A, B)
matmul3 = np.dot(A, B)
print("Matrix multiply:\n", matmul1)
# [[19 22]
#  [43 50]]

True matrix multiplication follows linear algebra rules: each element in the result is the dot product of a row from the first matrix with a column from the second. For result[0,0] = 19, we compute 1×5 + 2×7 = 5 + 14 = 19. The inner dimensions must match: an (m×n) matrix times an (n×p) matrix produces an (m×p) result.

NumPy provides three ways to perform matrix multiplication: the @ operator (most readable and recommended since Python 3.5), np.matmul() function, and np.dot(). For 2D arrays, they're equivalent. The @ operator is preferred because it clearly communicates intent and matches mathematical notation.

# Vector dot product
v1 = np.array([1, 2, 3])
v2 = np.array([4, 5, 6])
dot = np.dot(v1, v2)
print("Dot product:", dot)  # 32 = 1*4 + 2*5 + 3*6

For 1D arrays (vectors), np.dot() computes the dot product—the sum of element-wise products. It multiplies corresponding elements and adds them together: 1×4 + 2×5 + 3×6 = 4 + 10 + 18 = 32. The result is a single scalar value, not an array.

The dot product measures how similar two vectors are in direction and magnitude. It's fundamental in machine learning: it powers similarity measures and projections, and it is the core operation inside neural networks. A dot product of zero means the vectors are perpendicular (orthogonal).
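One common use of the dot product is cosine similarity: the dot product divided by the two vectors' lengths, giving a direction-only similarity in [-1, 1]. A sketch using the vectors above:

```python
import numpy as np

v1 = np.array([1.0, 2.0, 3.0])
v2 = np.array([4.0, 5.0, 6.0])

# Cosine similarity: dot product normalized by both magnitudes
cos_sim = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2))
print(round(cos_sim, 4))  # 0.9746 -- nearly parallel vectors
```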

Matrix Properties

import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 10]])

# Determinant
det = np.linalg.det(A)
print(f"Determinant: {det:.2f}")  # -3.00

The determinant is a single number that captures important properties of a square matrix. It tells you whether the matrix is invertible (determinant ≠ 0), how the matrix scales area/volume (|det| is the scaling factor), and whether it flips orientation (negative det means reflection).

For a 2×2 matrix [[a,b],[c,d]], the determinant is ad - bc. For larger matrices, the calculation is more complex but np.linalg.det() handles it efficiently. In machine learning, determinants appear in multivariate Gaussian distributions and regularization techniques.
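You can check the 2×2 formula ad - bc against NumPy directly (a quick sanity check, not from the original lesson):

```python
import numpy as np

M = np.array([[1, 2],
              [3, 4]])
manual = M[0, 0] * M[1, 1] - M[0, 1] * M[1, 0]  # ad - bc = 1*4 - 2*3 = -2
print(manual)                             # -2
print(round(float(np.linalg.det(M)), 6))  # -2.0
```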

# Trace (sum of diagonal)
trace = np.trace(A)
print(f"Trace: {trace}")  # 16 = 1 + 5 + 10

# Rank
rank = np.linalg.matrix_rank(A)
print(f"Rank: {rank}")  # 3

The trace is simply the sum of diagonal elements (1 + 5 + 10 = 16). Despite its simplicity, trace has useful properties: it equals the sum of eigenvalues and is invariant under cyclic permutations of matrix products. It appears in regularization and optimization algorithms.

Matrix rank tells you the number of linearly independent rows (or columns). A 3×3 matrix with rank 3 has "full rank"—all rows/columns are independent. If rank is less than matrix dimensions, some rows/columns are linear combinations of others, which affects whether systems of equations have unique solutions.
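To see a rank-deficient matrix, make one row a combination of the others — here the third row is the sum of the first two (a constructed example):

```python
import numpy as np

B = np.array([[1, 2, 3],
              [4, 5, 6],
              [5, 7, 9]])  # row 2 = row 0 + row 1, so rows are dependent
print(np.linalg.matrix_rank(B))  # 2, not full rank
```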

# Inverse
A_inv = np.linalg.inv(A)
print("Inverse:\n", A_inv.round(2))

# Verify: A @ A_inv = Identity
identity = A @ A_inv
print("A @ A_inv:\n", identity.round(2))

The matrix inverse A⁻¹ is the matrix that, when multiplied by A, produces the identity matrix (1s on diagonal, 0s elsewhere). Only square matrices with non-zero determinant have inverses. Computing A @ A_inv should give the identity matrix, verifying the inverse is correct.

While you can solve Ax = b using x = A⁻¹ @ b, this is generally not recommended. Computing the inverse is expensive and can introduce numerical errors. For solving equations, use np.linalg.solve() instead—it's faster and more numerically stable.

Solving Linear Equations

import numpy as np

# Solve: Ax = b
# 2x + 3y = 8
# 4x + 5y = 14

A = np.array([[2, 3],
              [4, 5]])
b = np.array([8, 14])

Systems of linear equations appear everywhere in science and engineering. Here we have two equations with two unknowns (x and y). In matrix form, this becomes Ax = b, where A contains the coefficients, x is the vector of unknowns, and b contains the constants on the right side.

Setting up the problem correctly is crucial: each row of matrix A corresponds to one equation's coefficients, and the corresponding element in b is that equation's constant. The order of variables must be consistent across all equations.

x = np.linalg.solve(A, b)
print(f"Solution: x={x[0]}, y={x[1]}")  # x=1, y=2

np.linalg.solve() finds the values of unknowns that satisfy all equations simultaneously. The function uses efficient algorithms (LU decomposition internally) that are both faster and more numerically stable than computing A⁻¹ and multiplying by b.

The solution tells us x=1 and y=2 satisfy both equations. You could verify by hand: 2(1) + 3(2) = 8 ✓ and 4(1) + 5(2) = 14 ✓. This technique scales to systems with hundreds or thousands of equations—problems that would be impossible to solve by hand.

# Verify
print("Verification:", A @ x)  # [8. 14.]

Always verify your solution by computing A @ x and checking it equals b. Due to floating-point arithmetic, you might get values like [7.9999999, 14.0000001] instead of exact integers—this is normal and acceptable for numerical computing.

If np.linalg.solve() raises a "Singular matrix" error, it means the system has no unique solution—either no solution exists (inconsistent equations) or infinitely many solutions exist (dependent equations). Check your matrix rank or use least-squares methods for such cases.
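For such cases, np.linalg.lstsq() returns a least-squares (minimum-norm) answer instead of raising. A sketch with a deliberately singular system, where the second equation is twice the first:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [2.0, 4.0]])  # rows are dependent -> singular matrix
b = np.array([3.0, 6.0])    # consistent system: infinitely many solutions

# np.linalg.solve(A, b) would raise LinAlgError here
x, residuals, rank, sv = np.linalg.lstsq(A, b, rcond=None)
print(rank)   # 1
print(A @ x)  # close to [3. 6.]
```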

Eigenvalues and Eigenvectors

import numpy as np

A = np.array([[4, 2],
              [1, 3]])

Eigenvalues and eigenvectors reveal the fundamental behavior of a matrix transformation. When a matrix multiplies an eigenvector, the result is simply the eigenvector scaled by the eigenvalue—the direction doesn't change, only the magnitude. This special property makes eigenvectors the "natural axes" of the transformation.

Understanding this concept is essential for Principal Component Analysis (PCA), Google's PageRank algorithm, quantum mechanics, and many machine learning techniques. The eigenvectors point in directions of maximum variance or importance in your data.

# Compute eigenvalues and eigenvectors
eigenvalues, eigenvectors = np.linalg.eig(A)

print("Eigenvalues:", eigenvalues)   # [5. 2.]
print("Eigenvectors:\n", eigenvectors)

np.linalg.eig() returns two arrays: eigenvalues (a 1D array of scalars) and eigenvectors (a 2D array where each COLUMN is an eigenvector). This column-based format is important—eigenvectors[:, 0] gives the first eigenvector corresponding to eigenvalues[0].

The eigenvalues [5, 2] tell us the matrix stretches by a factor of 5 along one eigenvector direction and by a factor of 2 along the other. In PCA, you'd typically sort eigenvalues in descending order and use the eigenvectors corresponding to the largest eigenvalues as your principal components.
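The PCA-style sort mentioned above takes a couple of lines with argsort. A sketch, reusing the matrix from this example:

```python
import numpy as np

A = np.array([[4, 2],
              [1, 3]])
eigenvalues, eigenvectors = np.linalg.eig(A)

# Sort largest-first; reorder the eigenvector COLUMNS to match
order = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]
print(eigenvalues.round(3))  # [5. 2.]
```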

# Verify: A @ v = lambda * v
for i in range(len(eigenvalues)):
    v = eigenvectors[:, i]
    lam = eigenvalues[i]
    print(f"A @ v{i}: {(A @ v).round(3)}")
    print(f"λ * v{i}: {(lam * v).round(3)}")

The defining equation for eigenvectors is Av = λv, where A is the matrix, v is an eigenvector, and λ (lambda) is the corresponding eigenvalue. This verification loop confirms that multiplying the matrix by each eigenvector produces the same result as scaling the eigenvector by its eigenvalue.

When both expressions produce identical results (within floating-point precision), you've confirmed the eigenvector relationship. This property is what makes eigenvectors so useful—complex matrix operations reduce to simple scalar multiplication along these special directions.

Norms and Distances

import numpy as np

v = np.array([3, 4])

# Vector norms
l2_norm = np.linalg.norm(v)       # Euclidean: sqrt(3^2 + 4^2) = 5
print(f"L2 (Euclidean): {l2_norm}")

The L2 norm (also called Euclidean norm) measures the straight-line length of a vector. For a 2D vector [3, 4], it's √(3² + 4²) = √(9 + 16) = √25 = 5. This is the classic distance formula you learned in geometry—the hypotenuse of a right triangle.

By default, np.linalg.norm() computes the L2 norm because it's the most commonly used. It measures the "magnitude" of a vector and is fundamental to normalizing vectors (making them unit length), computing distances, and many machine learning loss functions.
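Normalizing a vector to unit length is just dividing by its norm — a quick sketch with the [3, 4] vector above:

```python
import numpy as np

v = np.array([3.0, 4.0])
unit = v / np.linalg.norm(v)  # divide by length 5
print(unit)                   # [0.6 0.8]
print(np.linalg.norm(unit))   # 1.0
```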

l1_norm = np.linalg.norm(v, ord=1) # Manhattan: |3| + |4| = 7
linf_norm = np.linalg.norm(v, ord=np.inf)  # Max: max(|3|, |4|) = 4

print(f"L1 (Manhattan): {l1_norm}")
print(f"L∞ (Max): {linf_norm}")

The L1 norm (Manhattan distance) sums absolute values: |3| + |4| = 7. It's called Manhattan distance because it measures distance as if walking on a grid of city blocks—you can only move horizontally or vertically, not diagonally. L1 regularization in ML (Lasso) promotes sparsity by driving some weights to exactly zero.

The L∞ norm (infinity norm or max norm) returns the largest absolute value: max(|3|, |4|) = 4. It measures the largest component of the vector. In ML, L∞ bounds ensure no single weight grows too large, providing different regularization properties than L1 or L2.

# Distance between points
p1 = np.array([1, 2])
p2 = np.array([4, 6])
distance = np.linalg.norm(p2 - p1)
print(f"Distance: {distance}")  # 5.0

To find the distance between two points, compute the vector from one to the other (p2 - p1), then take its norm. Here, p2 - p1 = [4-1, 6-2] = [3, 4], and we already know the norm of [3, 4] is 5. This is the standard Euclidean distance formula: √((x₂-x₁)² + (y₂-y₁)²).

This pattern extends to any number of dimensions. In machine learning, you'll compute distances between feature vectors to find nearest neighbors, measure clustering quality, or compute similarity scores. The same formula works for 2D, 3D, or 1000-dimensional spaces.
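A minimal nearest-neighbor sketch built on this pattern, with made-up points:

```python
import numpy as np

points = np.array([[0.0, 0.0],
                   [3.0, 4.0],
                   [1.0, 1.0]])  # hypothetical dataset, one row per point
query = np.array([1.0, 2.0])

dists = np.linalg.norm(points - query, axis=1)  # one distance per row
nearest = np.argmin(dists)
print(nearest)  # 2  (the point [1, 1] is closest to the query)
```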

Linear Algebra Reference

Function | Description | Example
A @ B | Matrix multiplication | np.array([[1,2]]) @ np.array([[3],[4]])
np.dot(a, b) | Dot product / matrix multiply | np.dot([1,2], [3,4])
np.linalg.det(A) | Determinant | np.linalg.det([[1,2],[3,4]])
np.linalg.inv(A) | Matrix inverse | np.linalg.inv(A)
np.linalg.solve(A, b) | Solve Ax = b | np.linalg.solve(A, b)
np.linalg.eig(A) | Eigenvalues and eigenvectors | vals, vecs = np.linalg.eig(A)
np.linalg.norm(v) | Vector/matrix norm | np.linalg.norm([3, 4])
np.linalg.svd(A) | Singular value decomposition | U, s, Vh = np.linalg.svd(A)
np.trace(A) | Sum of diagonal | np.trace([[1,2],[3,4]])
np.linalg.matrix_rank(A) | Matrix rank | np.linalg.matrix_rank(A)

Practice: Linear Algebra

Task: Multiply a 3x3 matrix by a 3-element vector.

Show Solution
import numpy as np

A = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
v = np.array([1, 0, 1])
result = A @ v
print(result)  # [ 4 10 16]

Task: Solve: 3x + 2y = 12, x + 4y = 10

Show Solution
import numpy as np

A = np.array([[3, 2], [1, 4]])
b = np.array([12, 10])
solution = np.linalg.solve(A, b)
print(f"x = {solution[0]:.2f}, y = {solution[1]:.2f}")
# x = 2.80, y = 1.80

Task: Calculate the inverse of a 2x2 matrix and verify A @ A_inv = I.

Show Solution
import numpy as np

A = np.array([[4, 7], [2, 6]])
A_inv = np.linalg.inv(A)
print("Inverse:\n", A_inv.round(3))

# Verify
identity = A @ A_inv
print("A @ A_inv:\n", identity.round(3))
# [[1. 0.]
#  [0. 1.]]

Task: Calculate the Euclidean distance between points (1,2,3) and (4,5,6).

Show Solution
import numpy as np

p1 = np.array([1, 2, 3])
p2 = np.array([4, 5, 6])

# Method 1: Using norm
distance = np.linalg.norm(p2 - p1)
print(f"Distance: {distance:.4f}")  # 5.1962

# Method 2: Manual calculation
distance = np.sqrt(np.sum((p2 - p1)**2))
print(f"Distance: {distance:.4f}")  # 5.1962
10


Common Use Cases

Data Preprocessing
# Normalize data
data = (data - data.mean()) / data.std()

# Handle missing values
data[np.isnan(data)] = 0

# Clip outliers
data = np.clip(data, -3, 3)
Machine Learning
# Train/test split
n = len(data)
train = data[:int(0.8*n)]
test = data[int(0.8*n):]

# Feature scaling
X = (X - X.min()) / (X.max() - X.min())
Image Processing
# Grayscale conversion
gray = 0.299*R + 0.587*G + 0.114*B

# Resize by averaging
small = img.reshape(h//2, 2, w//2, 2).mean((1,3))

Debug Checklist

Common Errors and Fixes
Error Likely Cause Solution
ValueError: cannot reshape Total elements don't match Check arr.size equals product of new dimensions
ValueError: operands could not be broadcast Incompatible shapes Check shapes with arr.shape, use reshape or newaxis
IndexError: index out of bounds Index exceeds array size Check arr.shape, use negative indexing if needed
TypeError: ufunc doesn't support dtype Wrong data type Use arr.astype(np.float64)
LinAlgError: Singular matrix Matrix has no inverse Check determinant != 0, use pseudoinverse instead

Best Practices

Do
  • Use vectorized operations instead of loops
  • Preallocate arrays when size is known
  • Use appropriate dtypes to save memory
  • Use views instead of copies when possible
  • Set random seed for reproducibility
  • Use broadcasting for efficient operations
Don't
  • Don't use Python loops for array operations
  • Don't grow arrays with append in loops
  • Don't confuse * (element-wise) with @ (matrix)
  • Don't ignore dtype mismatches
  • Don't forget axis parameter in aggregations
  • Don't modify arrays while iterating

Common Mistakes

Mistake | Problem | Solution
a * b for matrices | Element-wise, not matrix multiply | Use a @ b or np.matmul(a, b)
arr.sum() vs arr.sum(axis=0) | Sums everything vs. by column | Always specify axis for 2D arrays
Modifying a slice | Changes original array (view) | Use .copy() for independent copy
arr == np.nan | NaN comparisons are always False | Use np.isnan(arr)
Integer division | Truncates decimals | Use float dtype or true division
11

Quick Reference

A comprehensive cheat sheet of the most commonly used NumPy functions and patterns.

Array Creation

Function | Description | Example
np.array() | Create from list | np.array([1, 2, 3])
np.zeros() | Array of zeros | np.zeros((3, 4))
np.ones() | Array of ones | np.ones((2, 3))
np.arange() | Range array | np.arange(0, 10, 2)
np.linspace() | Evenly spaced | np.linspace(0, 1, 5)
np.eye() | Identity matrix | np.eye(3)
np.random.rand() | Random 0-1 | np.random.rand(3, 3)
np.random.randint() | Random integers | np.random.randint(0, 10, (2, 3))
np.full() | Fill with value | np.full((2, 3), 7)
np.empty() | Uninitialized | np.empty((2, 2))

Indexing and Slicing

Syntax | Description | Example
arr[i] | Single element | arr[0]
arr[-1] | Last element | arr[-1]
arr[start:stop] | Slice | arr[1:4]
arr[::step] | Every nth | arr[::2]
arr[row, col] | 2D element | arr[1, 2]
arr[:, col] | Column | arr[:, 0]
arr[row, :] | Row | arr[0, :]
arr[condition] | Boolean filter | arr[arr > 5]
arr[[i, j, k]] | Fancy indexing | arr[[0, 2, 4]]

Aggregations

Function | Description | Axis Behavior
sum() | Sum of elements | axis=0: columns, axis=1: rows
mean() | Average | axis=0: columns, axis=1: rows
std() | Standard deviation | axis=0: columns, axis=1: rows
min(), max() | Extremes | axis=0: columns, axis=1: rows
argmin(), argmax() | Index of extreme | axis=0: columns, axis=1: rows
cumsum() | Cumulative sum | axis=0: down, axis=1: across
prod() | Product | axis=0: columns, axis=1: rows

Ready-to-Use Templates

import numpy as np

def normalize_data(data, method='minmax'):
    """Normalize data using min-max or z-score method."""
    if method == 'minmax':
        return (data - data.min()) / (data.max() - data.min())
    elif method == 'zscore':
        return (data - data.mean()) / data.std()
    else:
        raise ValueError("Method must be 'minmax' or 'zscore'")

# Example usage
data = np.random.rand(100) * 100
normalized = normalize_data(data, 'minmax')
print(f"Range: [{normalized.min():.2f}, {normalized.max():.2f}]")

import numpy as np

# Create sample grayscale image (50x50)
image = np.random.randint(0, 256, (50, 50), dtype=np.uint8)

# Basic operations
inverted = 255 - image
brightened = np.clip(image.astype(np.int16) + 50, 0, 255).astype(np.uint8)   # cast first: uint8 + 50 would wrap around
contrast = np.clip((image.astype(np.float64) - 128) * 1.5 + 128, 0, 255).astype(np.uint8)  # cast first: uint8 - 128 would underflow

# Downscale by averaging 2x2 blocks
downscaled = image.reshape(25, 2, 25, 2).mean(axis=(1, 3))

print(f"Original shape: {image.shape}")
print(f"Downscaled shape: {downscaled.shape}")

import numpy as np

def analyze_data(data):
    """Compute comprehensive statistics for a dataset."""
    stats = {
        'count': len(data),
        'mean': np.mean(data),
        'median': np.median(data),
        'std': np.std(data),
        'var': np.var(data),
        'min': np.min(data),
        'max': np.max(data),
        'range': np.ptp(data),
        'q25': np.percentile(data, 25),
        'q75': np.percentile(data, 75),
    }
    stats['iqr'] = stats['q75'] - stats['q25']
    return stats

# Example usage
data = np.random.normal(100, 15, 1000)
results = analyze_data(data)
for key, value in results.items():
    print(f"{key}: {value:.2f}")

import numpy as np

def pairwise_distances(points):
    """Calculate pairwise Euclidean distances between points."""
    # points: (n_samples, n_features)
    # Use broadcasting: ||a - b||^2 = ||a||^2 + ||b||^2 - 2*a·b
    sq_norms = np.sum(points**2, axis=1)
    distances = np.sqrt(sq_norms[:, np.newaxis] + sq_norms - 2 * points @ points.T)
    return distances

# Example: 4 points in 2D
points = np.array([[0, 0], [1, 0], [0, 1], [1, 1]])
dist_matrix = pairwise_distances(points)
print(dist_matrix.round(3))
# [[0.    1.    1.    1.414]
#  [1.    0.    1.414 1.   ]
#  [1.    1.414 0.    1.   ]
#  [1.414 1.    1.    0.   ]]

import numpy as np

def one_hot_encode(labels, n_classes=None):
    """Convert integer labels to one-hot encoding."""
    labels = np.array(labels)
    if n_classes is None:
        n_classes = labels.max() + 1
    one_hot = np.zeros((len(labels), n_classes))
    one_hot[np.arange(len(labels)), labels] = 1
    return one_hot

# Example usage
labels = np.array([0, 1, 2, 1, 0])
encoded = one_hot_encode(labels, n_classes=3)
print(encoded)
# [[1. 0. 0.]
#  [0. 1. 0.]
#  [0. 0. 1.]
#  [0. 1. 0.]
#  [1. 0. 0.]]

import numpy as np

def process_in_batches(data, batch_size, func):
    """Process large array in batches to manage memory."""
    results = []
    n_samples = len(data)
    
    for i in range(0, n_samples, batch_size):
        batch = data[i:i + batch_size]
        result = func(batch)
        results.append(result)
    
    return np.concatenate(results)

# Example: Square large array in batches
data = np.arange(1000000)
squared = process_in_batches(data, batch_size=100000, func=lambda x: x**2)
print(f"Processed {len(squared)} elements")
print(f"First 5: {squared[:5]}")  # [0 1 4 9 16]

Performance Tips

Speed Optimizations
  • Preallocate: Create arrays with np.zeros() instead of append
  • Contiguous: Use np.ascontiguousarray() for cache efficiency
  • In-place: Use += instead of + when possible
  • Vectorize: Replace loops with array operations
  • Views: Slicing creates views, not copies
Memory Management
  • dtype: Use float32 instead of float64 if precision allows
  • Batching: Process large arrays in chunks
  • Del: Delete large arrays when done with del arr
  • Memory map: np.memmap for huge files
  • Sparse: Use scipy.sparse for sparse matrices

Key Takeaways

NumPy is Fast

Arrays are 10-100x faster than lists for numerical operations.

Vectorized Operations

Operations apply to entire arrays without explicit loops.

Shape Matters

Understand shape, dtype, and ndim to manipulate arrays correctly.

Boolean Indexing

Filter arrays with conditions like arr[arr > 0].

Broadcasting

Smaller arrays automatically expand to match larger ones.

Use axis Parameter

axis=0 for columns, axis=1 for rows in aggregations.

@ for Matrix Multiply

Use @ for matrix multiplication, * is element-wise.

Views vs Copies

Slicing creates views. Use .copy() for independent copies.

Set Random Seed

np.random.seed(42) ensures reproducible random results.

Knowledge Check

Quick Quiz

Test what you've learned about NumPy arrays and operations

1 What makes NumPy arrays faster than Python lists?
2 What does arr.shape return for a 2D array?
3 How do you get all elements greater than 5 from an array?
4 What does axis=0 mean in arr.sum(axis=0)?
5 What happens when you broadcast a (3,1) array with a (1,4) array?
6 What is the difference between A * B and A @ B for matrices?
Answer all questions to check your score