Module 3.2

Array Operations

Unlock the power of vectorized operations! Learn element-wise math, broadcasting rules, universal functions, and aggregations that make NumPy blazing fast.

35 min read
Beginner
Hands-on Examples
What You'll Learn
  • Element-wise operations
  • Broadcasting rules and patterns
  • Universal functions (ufuncs)
  • Aggregation functions (sum, mean, std)
  • Boolean indexing and filtering
01

Element-wise Operations

Element-wise operations are NumPy's superpower. Instead of writing loops to process each element, you apply an operation to an entire array at once. This is called vectorization: the loop still happens, but inside NumPy's compiled code rather than in Python, which is why NumPy is so fast.

Arithmetic Operations

Standard arithmetic operators (+, -, *, /, **) work element-by-element on arrays:

import numpy as np

a = np.array([1, 2, 3, 4])
b = np.array([10, 20, 30, 40])

Addition and subtraction work on corresponding elements:

print(a + b)    # [11 22 33 44]
print(b - a)    # [ 9 18 27 36]

Multiplication and division are also element-wise (not matrix operations):

print(a * b)    # [ 10  40  90 160]
print(b / a)    # [10. 10. 10. 10.]

The power operator raises each element to the given exponent:

print(a ** 2)   # [ 1  4  9 16]
Key Insight: No loops needed! Each operation is applied to every element in a single call, with the looping done in compiled code. This is vectorization, the key to fast numerical computing.
NumPy operations: element-wise (+), universal functions (sqrt), aggregation (sum), and axis operations
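
To make the contrast with explicit loops concrete, here is a minimal sketch that computes the same sum both ways. The names looped and vectorized are illustrative and not part of the lesson's own examples.

```python
import numpy as np

a = np.array([1, 2, 3, 4])
b = np.array([10, 20, 30, 40])

# Loop version: Python handles one pair of elements at a time
looped = np.empty_like(a)
for i in range(len(a)):
    looped[i] = a[i] + b[i]

# Vectorized version: the same loop runs inside NumPy's compiled code
vectorized = a + b

print(np.array_equal(looped, vectorized))  # True
```

Both versions give the same result; the vectorized one is shorter and, on large arrays, typically much faster.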

Operations with Scalars

When you combine an array with a single value (scalar), the operation is applied to every element:

arr = np.array([1, 2, 3, 4, 5])

print(arr + 10)     # [11 12 13 14 15]
print(arr * 2)      # [ 2  4  6  8 10]

A common pattern is normalizing data to a 0-1 range:

# Normalize data to 0-1 range
data = np.array([100, 200, 300, 400, 500])
normalized = (data - data.min()) / (data.max() - data.min())
print(normalized)   # [0.   0.25 0.5  0.75 1.  ]

Comparison Operations

Comparisons also work element-wise, returning boolean arrays:

a = np.array([1, 2, 3, 4, 5])
b = np.array([5, 4, 3, 2, 1])

You can compare arrays to each other or to a scalar:

print(a > b)     # [False False False  True  True]
print(a == b)    # [False False  True False False]
print(a >= 3)    # [False False  True  True  True]

Boolean arrays can be used for filtering:

# Use boolean array for indexing
print(a[a > 2])  # [3 4 5]
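
Conditions can also be combined before filtering. The sketch below assumes the same array a as above and uses the element-wise operators & (and), | (or), and ~ (not):

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5])

# Each condition needs its own parentheses because & and | bind
# more tightly than comparison operators
print(a[(a > 1) & (a < 5)])   # [2 3 4]  (AND)
print(a[(a < 2) | (a > 4)])   # [1 5]    (OR)
print(a[~(a > 2)])            # [1 2]    (NOT)
```

Python's and / or keywords do not work here, because they expect a single True or False rather than an array of booleans.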

Operations on 2D Arrays

Element-wise operations work the same way for multi-dimensional arrays:

A = np.array([[1, 2], 
              [3, 4]])
B = np.array([[10, 20], 
              [30, 40]])

Element-wise addition adds corresponding elements:

print(A + B)
# [[11 22]
#  [33 44]]

Element-wise multiplication multiplies corresponding elements (this is NOT matrix multiplication!):

print(A * B)
# [[ 10  40]
#  [ 90 160]]
Important: A * B is element-wise multiplication. For matrix multiplication, use A @ B or np.dot(A, B). We'll cover this in Topic 3.3.
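
A short side-by-side sketch of the difference, using the same A and B as above (the variable names elementwise and matmul are illustrative):

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
B = np.array([[10, 20],
              [30, 40]])

elementwise = A * B   # multiplies matching positions
matmul = A @ B        # rows of A dotted with columns of B

print(elementwise)
# [[ 10  40]
#  [ 90 160]]
print(matmul)
# [[ 70 100]
#  [150 220]]
```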

Practice Questions: Element-wise Operations

Test your understanding of array operations.

Given:

prices = np.array([10, 25, 50, 100])

Task: Double all prices.

Expected output: [ 20 50 100 200]

Show Solution
prices = np.array([10, 25, 50, 100])
doubled = prices * 2
print(doubled)  # [ 20  50 100 200]

Given:

old_prices = np.array([100, 200, 150])
new_prices = np.array([120, 180, 165])

Task: Calculate percentage change for each item.

Formula: ((new - old) / old) * 100

Show Solution
pct_change = ((new_prices - old_prices) / old_prices) * 100
print(pct_change)  # [ 20. -10.  10.]

Given:

data = np.array([20, 50, 80, 30, 100])

Task: Normalize values to range [0, 1].

Formula: (x - min) / (max - min)

Show Solution
data = np.array([20, 50, 80, 30, 100])
normalized = (data - data.min()) / (data.max() - data.min())
print(normalized)  # [0.    0.375 0.75  0.125 1.   ]

Given:

scores = np.array([45, 72, 88, 55, 91, 67, 83])

Task: Find all scores between 60 and 85 (inclusive).

Hint: Use & with parentheses around each condition

Show Solution
scores = np.array([45, 72, 88, 55, 91, 67, 83])
in_range = scores[(scores >= 60) & (scores <= 85)]
print(in_range)  # [72 67 83]

Given:

point1 = np.array([1, 2])
point2 = np.array([4, 6])

Task: Calculate Euclidean distance between points.

Formula: sqrt((x2-x1)² + (y2-y1)²)

Show Solution
point1 = np.array([1, 2])
point2 = np.array([4, 6])
diff = point2 - point1
distance = np.sqrt(np.sum(diff ** 2))
print(distance)  # 5.0
02

Broadcasting

Broadcasting is NumPy's powerful mechanism for working with arrays of different shapes. It allows operations between arrays that don't have the same dimensions by automatically "stretching" the smaller array.

What is Broadcasting?

When you add a scalar to an array, NumPy "broadcasts" the scalar across all elements. But broadcasting works with arrays of different shapes too:


Scalar Broadcasting: The scalar 10 is virtually expanded to [10, 10, 10] to match the array shape. No extra memory is used.

import numpy as np

# Simple broadcasting: scalar to array
arr = np.array([1, 2, 3])
print(arr + 10)  # [11 12 13]
# The scalar 10 is "broadcast" to [10, 10, 10]

# Broadcasting: 1D array with 2D array
matrix = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 9]])
row = np.array([10, 20, 30])

print(matrix + row)
# [[11 22 33]
#  [14 25 36]
#  [17 28 39]]
# The row is broadcast across all rows of the matrix
Without Broadcasting

You'd need to manually replicate the row 3 times, then add.

With Broadcasting

NumPy handles the replication automatically, without making copies of the row in memory.
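
The two approaches can be compared directly. In this sketch, np.tile builds the replicated copy explicitly, while np.broadcast_to returns a read-only view whose zero stride along the repeated axis suggests that the row is not copied. The variable names are illustrative.

```python
import numpy as np

matrix = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 9]])
row = np.array([10, 20, 30])

# Manual replication: allocates a full (3, 3) copy of the row
tiled = np.tile(row, (3, 1))
manual = matrix + tiled

# Broadcasting: same result without building the copy yourself
broadcasted = matrix + row
print(np.array_equal(manual, broadcasted))  # True

# A broadcast view reuses the same memory for every row:
# the stride along axis 0 is 0 bytes
view = np.broadcast_to(row, (3, 3))
print(view.strides[0])  # 0
```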

Broadcasting Rules

NumPy compares shapes from right to left, treating any missing leading dimension as 1. Two dimensions are compatible when:

  1. They are equal, or
  2. One of them is 1
Array A    | Array B | Result      | Why?
(3, 4)     | (4,)    | (3, 4) ✓    | (4) == (4), (3) broadcasts with missing dim
(3, 4)     | (3, 1)  | (3, 4) ✓    | (4) broadcasts with (1), (3) == (3)
(3, 4)     | (3,)    | Error ✗     | (4) != (3) and neither is 1
(2, 3, 4)  | (3, 4)  | (2, 3, 4) ✓ | Last two dims match
(2, 3, 4)  | (1,)    | (2, 3, 4) ✓ | (1) broadcasts with everything
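
You can check these rules without building any arrays. The sketch below runs the shape pairs from the table through np.broadcast_shapes, which is assumed to be available (it was added in NumPy 1.20). The pairs and results lists are names made up for this example.

```python
import numpy as np

# Shape pairs from the table above
pairs = [
    ((3, 4), (4,)),
    ((3, 4), (3, 1)),
    ((3, 4), (3,)),
    ((2, 3, 4), (3, 4)),
    ((2, 3, 4), (1,)),
]

results = []
for shape_a, shape_b in pairs:
    try:
        result = np.broadcast_shapes(shape_a, shape_b)
    except ValueError:
        result = None  # shapes are not broadcast-compatible
    results.append(result)
    print(shape_a, shape_b, "->", result if result is not None else "Error")
```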

Practical Broadcasting Examples

Example 1: Subtract the column mean from each column. The column means have shape (3,) which broadcasts with (3, 3):

data = np.array([[1, 2, 3],
                 [4, 5, 6],
                 [7, 8, 9]])

col_means = data.mean(axis=0)  # [4. 5. 6.] - shape (3,)
centered = data - col_means
print(centered)
# [[-3. -3. -3.]
#  [ 0.  0.  0.]
#  [ 3.  3.  3.]]

Example 2: Subtract the row mean from each row. Use keepdims=True to get shape (3, 1) instead of (3,):

row_means = data.mean(axis=1, keepdims=True)  # [[2.], [5.], [8.]]
centered = data - row_means
print(centered)
# [[-1.  0.  1.]
#  [-1.  0.  1.]
#  [-1.  0.  1.]]

Example 3: Compute an outer product using broadcasting. Adding a dimension with np.newaxis changes shape from (3,) to (3, 1):

a = np.array([1, 2, 3])[:, np.newaxis]  # Shape (3, 1)
b = np.array([10, 20, 30])              # Shape (3,)

print(a * b)  # (3, 1) * (3,) = (3, 3)
# [[ 10  20  30]
#  [ 20  40  60]
#  [ 30  60  90]]
Tip: Use keepdims=True in aggregations to preserve dimensions for broadcasting. Use np.newaxis or reshape() to add dimensions.
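
As a quick illustration of the tip, here is a sketch showing that np.newaxis and reshape() produce the same column shape. The array v is a made-up example.

```python
import numpy as np

v = np.array([1, 2, 3])       # shape (3,)

col_a = v[:, np.newaxis]      # shape (3, 1)
col_b = v.reshape(-1, 1)      # shape (3, 1); -1 lets NumPy infer the length
row = v[np.newaxis, :]        # shape (1, 3)

print(col_a.shape, col_b.shape, row.shape)  # (3, 1) (3, 1) (1, 3)
print(np.array_equal(col_a, col_b))         # True
```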

Practice Questions: Broadcasting

Test your understanding of broadcasting rules.

Task: Determine if these shapes are broadcast-compatible and what the result shape would be:

  • (5, 3) and (3,)
  • (4, 1) and (3,)
  • (2, 3) and (3, 2)
Show Solution

(5, 3) + (3,) → ✓ Result: (5, 3)

(4, 1) + (3,) → ✓ Result: (4, 3)

(2, 3) + (3, 2) → ✗ Error! The trailing dimensions (3 and 2) differ and neither is 1

Task: Normalize each row of this matrix by dividing by the row sum:

data = np.array([[1, 2, 3],
                 [4, 5, 6]])
Show Solution
row_sums = data.sum(axis=1, keepdims=True)
normalized = data / row_sums
print(normalized.round(3))  # round for readable output
# [[0.167 0.333 0.5  ]
#  [0.267 0.333 0.4  ]]
03

Universal Functions (ufuncs)

Universal functions (ufuncs) are NumPy's optimized functions that operate on arrays element-by-element. They're implemented in compiled C code, making them incredibly fast.

What are Universal Functions?

When you use np.sqrt(arr) or np.sin(arr), you're using ufuncs. They automatically apply to every element without explicit loops:

import numpy as np

arr = np.array([1, 4, 9, 16, 25])

# ufunc example: square root
print(np.sqrt(arr))  # [1. 2. 3. 4. 5.]

# Much faster than Python loops
# Slow way (don't do this):
# result = [math.sqrt(x) for x in arr]

# Fast way (ufunc):
result = np.sqrt(arr)

Mathematical ufuncs

Category      | Functions                               | Description
Basic Math    | add, subtract, multiply, divide, power  | Arithmetic operations
Rounding      | floor, ceil, round, trunc               | Round down, up, to nearest, or toward zero
Exponential   | exp, exp2, log, log2, log10             | Exponentials and logarithms
Trigonometric | sin, cos, tan, arcsin, arccos, arctan   | Trig functions (radians)
Absolute      | abs, absolute, fabs                     | Absolute value
Signs         | sign, negative, positive                | Sign operations

# Examples of common ufuncs
arr = np.array([-2.5, -1.2, 0, 1.7, 3.9])

print(np.abs(arr))      # [2.5 1.2 0.  1.7 3.9]
print(np.floor(arr))    # [-3. -2.  0.  1.  3.]
print(np.ceil(arr))     # [-2. -1.  0.  2.  4.]
print(np.round(arr))    # [-2. -1.  0.  2.  4.]
print(np.sign(arr))     # [-1. -1.  0.  1.  1.]
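
One detail in the output above may look surprising: np.round turns -2.5 into -2., not -3. NumPy rounds exact halves to the nearest even value. A minimal sketch with a made-up halves array:

```python
import numpy as np

halves = np.array([0.5, 1.5, 2.5, 3.5, -2.5])

# Exact halves go to the nearest even integer
rounded = np.round(halves)
print(rounded)   # [ 0.  2.  2.  4. -2.]
```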

Exponentials and logarithms are essential for data science calculations like normalizing values or working with probability distributions:

arr = np.array([1, 2, 3])
print(np.exp(arr))      # [ 2.718  7.389 20.086]
print(np.log(arr))      # [0.    0.693 1.099] (natural log)
print(np.log10(arr))    # [0.    0.301 0.477] (base 10)
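
As one illustration of the probability use case, the sketch below turns arbitrary scores into values that sum to 1 using np.exp (a calculation commonly called softmax). The scores array and the max-subtraction step for numerical stability are choices made for this example, not part of the lesson.

```python
import numpy as np

scores = np.array([1.0, 2.0, 3.0])

# Subtracting the max first keeps np.exp from overflowing on large inputs;
# it does not change the final result
shifted = scores - scores.max()
exps = np.exp(shifted)
probs = exps / exps.sum()

print(probs.round(3))  # [0.09  0.245 0.665]
```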

Trigonometric functions work with angles in radians. These are used frequently in signal processing and feature engineering:

angles = np.array([0, np.pi/4, np.pi/2, np.pi])
print(np.sin(angles))   # [0.    0.707 1.    0.   ]
print(np.cos(angles))   # [1.    0.707 0.   -1.   ]
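
The zeros in the comments above are rounded. In floating point, np.sin(np.pi) comes out as a tiny nonzero number, so compare with np.isclose rather than ==. This sketch also uses np.deg2rad for inputs given in degrees; the degrees array is a made-up example.

```python
import numpy as np

# sin(pi) is mathematically 0, but floating point leaves a tiny residue
value = np.sin(np.pi)
print(value == 0)             # False
print(np.isclose(value, 0))   # True

# Convert degrees to radians before calling trig functions
degrees = np.array([0, 30, 90])
sines = np.sin(np.deg2rad(degrees))
print(sines)  # approximately [0.  0.5 1. ]
```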

Binary ufuncs (Two Inputs)

Binary ufuncs take two arrays as input and compute element-wise results:

a = np.array([1, 5, 10])
b = np.array([2, 3, 4])

np.maximum() and np.minimum() compare element-wise:

print(np.maximum(a, b))  # [ 2  5 10]
print(np.minimum(a, b))  # [1 3 4]

Other useful binary ufuncs include power, modulo, and comparisons:

print(np.power(a, b))    # [   1  125 10000]
print(np.mod(a, b))      # [1 2 2]
print(np.greater(a, b))  # [False  True  True]

The out Parameter

For memory efficiency, you can specify an output array instead of creating a new one. Without out, NumPy creates a new array:

arr = np.array([1, 2, 3, 4])
result = np.square(arr)
print(result)  # [ 1  4  9 16]

With out, NumPy writes to an existing array (no new allocation):

output = np.empty(4)
np.square(arr, out=output)
print(output)  # [ 1.  4.  9. 16.]

You can even do in-place operations by setting the output to the same array:

np.square(arr, out=arr)  # Modify arr directly
print(arr)  # [ 1  4  9 16]
Performance Tip: Using out= avoids memory allocation, which can speed up tight loops significantly. Especially useful in iterative algorithms.
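
A minimal sketch of that pattern, assuming a made-up update rule x = 2*x + 1 repeated three times. The scratch buffer buf is allocated once and reused on every pass.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])
buf = np.empty_like(x)   # scratch buffer allocated once, outside the loop

# Repeatedly apply x = 2*x + 1 without allocating new arrays
for _ in range(3):
    np.multiply(x, 2.0, out=buf)
    np.add(buf, 1.0, out=x)

print(x)  # [15. 23. 31.]
```

The output array's dtype must be able to hold the result. For example, writing np.sqrt of an integer array back into that integer array raises a casting error, because the result is floating point.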

Practice Questions: Universal Functions

Apply ufuncs to solve these exercises.

Task: Create an array [1, 4, 9, 16, 25] and compute the square root of each element.

Show Solution
arr = np.array([1, 4, 9, 16, 25])
print(np.sqrt(arr))  # [1. 2. 3. 4. 5.]

Given:

a = np.array([3, 7, 2, 8])
b = np.array([5, 1, 9, 4])

Task: Find the element-wise maximum of the two arrays.

Show Solution
result = np.maximum(a, b)
print(result)  # [5 7 9 8]
04

Aggregations

Aggregation functions reduce arrays to single values (or smaller arrays). They're essential for computing statistics: sum, mean, max, min, standard deviation, and more.

Basic Aggregation Functions

import numpy as np
arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

Calculate sum and product of all elements:

print(np.sum(arr))      # 55
print(np.prod(arr))     # 3628800

Find minimum and maximum values:

print(np.min(arr))      # 1
print(np.max(arr))      # 10

Find the index of min and max values:

print(np.argmin(arr))   # 0 (index of minimum)
print(np.argmax(arr))   # 9 (index of maximum)

Calculate statistics like mean, median, and standard deviation:

print(np.mean(arr))     # 5.5
print(np.median(arr))   # 5.5
print(np.std(arr))      # 2.872 (standard deviation)
Function     | Description        | Method Syntax
np.sum()     | Sum of elements    | arr.sum()
np.prod()    | Product of elements| arr.prod()
np.mean()    | Arithmetic mean    | arr.mean()
np.std()     | Standard deviation | arr.std()
np.var()     | Variance           | arr.var()
np.min()     | Minimum value      | arr.min()
np.max()     | Maximum value      | arr.max()
np.argmin()  | Index of minimum   | arr.argmin()
np.argmax()  | Index of maximum   | arr.argmax()
np.cumsum()  | Cumulative sum     | arr.cumsum()
np.cumprod() | Cumulative product | arr.cumprod()
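
A note on np.std() and np.var(): by default they divide by N (the population formulas). If you need the sample versions that divide by N - 1, pass ddof=1. A short sketch using the same arr of 1 through 10; the names pop_std and sample_std are illustrative.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

# Default (ddof=0): divide by N, the population standard deviation
pop_std = np.std(arr)
print(f"{pop_std:.3f}")      # 2.872

# ddof=1: divide by N - 1, the sample standard deviation
sample_std = np.std(arr, ddof=1)
print(f"{sample_std:.3f}")   # 3.028
```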

Aggregating Along Axes

For multi-dimensional arrays, use the axis parameter to aggregate along specific dimensions:

arr = np.array([[1, 2, 3],
                [4, 5, 6],
                [7, 8, 9]])

Without specifying axis, the function aggregates all elements:

# No axis: aggregate all elements
print(np.sum(arr))  # 45 (sum of all)

With axis=0, aggregate down columns (collapse rows):

# axis=0: sum each column
print(np.sum(arr, axis=0))  # [12 15 18]

With axis=1, aggregate across rows (collapse columns):

# axis=1: sum each row
print(np.sum(arr, axis=1))  # [ 6 15 24]
axis=0

Collapse rows → result has one value per column. "Sum down each column"

axis=1

Collapse columns → result has one value per row. "Sum across each row"
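
With a square array it is easy to mix up the two axes, so here is a sketch using a made-up 3 x 4 array, where the result shapes differ:

```python
import numpy as np

arr = np.arange(12).reshape(3, 4)   # 3 rows, 4 columns

col_sums = arr.sum(axis=0)   # axis 0 (length 3) is collapsed
row_sums = arr.sum(axis=1)   # axis 1 (length 4) is collapsed

print(col_sums, col_sums.shape)  # [12 15 18 21] (4,)
print(row_sums, row_sums.shape)  # [ 6 22 38] (3,)
```

The axis you name is the one that disappears from the result's shape.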

Use keepdims=True to preserve dimensions for broadcasting:

# More examples with axis
print(np.mean(arr, axis=0))  # [4. 5. 6.] (column means)
print(np.mean(arr, axis=1))  # [2. 5. 8.] (row means)
print(np.max(arr, axis=0))   # [7 8 9] (max in each column)
print(np.max(arr, axis=1))   # [3 6 9] (max in each row)

# keepdims preserves dimensions (useful for broadcasting)
col_means = np.mean(arr, axis=0, keepdims=True)
print(col_means.shape)  # (1, 3) instead of (3,)

Boolean Aggregations

arr = np.array([1, 2, 3, 4, 5])

Check if any element satisfies a condition:

print(np.any(arr > 3))  # True (4 and 5 are > 3)

Check if all elements satisfy a condition:

print(np.all(arr > 0))  # True (all positive)
print(np.all(arr > 3))  # False (1, 2, 3 are not)

Count how many elements satisfy a condition:

# Sum of True values = count
print(np.sum(arr > 2))   # 3 (three values > 2)

# Mean of True values = percentage
print(np.mean(arr > 2))  # 0.6 (60% are > 2)
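
An alternative for counting is np.count_nonzero, which gives the same count as summing the boolean array and also accepts an axis argument. The grid array below is a made-up example.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])

# Same count as np.sum(arr > 2), with the intent spelled out
total = np.count_nonzero(arr > 2)
print(total)  # 3

# Count matches per row of a 2D array
grid = np.array([[1, 6, 3],
                 [7, 2, 9]])
per_row = np.count_nonzero(grid > 5, axis=1)
print(per_row)  # [1 2]
```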

Cumulative Operations

Cumulative operations compute running totals:

arr = np.array([1, 2, 3, 4, 5])

print(np.cumsum(arr))   # [ 1  3  6 10 15]
print(np.cumprod(arr))  # [  1   2   6  24 120]

Practical example - tracking running sum of sales:

daily_sales = np.array([100, 150, 200, 180, 220])
total_to_date = np.cumsum(daily_sales)
print(total_to_date)  # [100 250 450 630 850]
Memory Tip: Method syntax (arr.sum()) and function syntax (np.sum(arr)) do the same thing. Use whichever you prefer!

Practice Questions: Aggregations

Master statistical operations with these exercises.

Given:

scores = np.array([85, 92, 78, 95, 88, 76, 90])

Task: Find the mean, median, and standard deviation.

Show Solution
scores = np.array([85, 92, 78, 95, 88, 76, 90])
print(f"Mean: {np.mean(scores):.2f}")    # 86.29
print(f"Median: {np.median(scores)}")    # 88.0
print(f"Std: {np.std(scores):.2f}")      # 6.56

Given:

temps = np.array([23, 28, 31, 25, 29, 27])

Task: Find the index of the hottest day.

Show Solution
temps = np.array([23, 28, 31, 25, 29, 27])
hottest_day = np.argmax(temps)
print(f"Hottest day index: {hottest_day}")  # 2
print(f"Temperature: {temps[hottest_day]}")  # 31

Given:

sales = np.array([[100, 200, 150],
                  [120, 180, 160],
                  [110, 220, 140]])

Task: Find the total sales for each product (columns).

Expected output: [330 600 450]

Show Solution
product_totals = np.sum(sales, axis=0)
print(product_totals)  # [330 600 450]

Given:

grades = np.array([72, 85, 90, 65, 88, 78, 92, 58, 81])

Task: What percentage of students scored above 80?

Show Solution
grades = np.array([72, 85, 90, 65, 88, 78, 92, 58, 81])
pct_above_80 = np.mean(grades > 80) * 100
print(f"{pct_above_80:.1f}%")  # 55.6%

Given:

daily_revenue = np.array([500, 800, 650, 920, 750, 1100, 890])

Task: Calculate the running total and find on which day (index) it first exceeds 3000.

Show Solution
daily_revenue = np.array([500, 800, 650, 920, 750, 1100, 890])
running_total = np.cumsum(daily_revenue)
print(running_total)  # [ 500 1300 1950 2870 3620 4720 5610]
day = np.argmax(running_total > 3000)
print(f"Day {day}: {running_total[day]}")  # Day 4: 3620
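
One caveat with the np.argmax approach: if the running total never exceeds the threshold, np.argmax still returns 0, which looks like a valid index. A minimal sketch of a guard, using a shorter made-up running_total that stays below 3000:

```python
import numpy as np

running_total = np.array([500, 1300, 1950])
mask = running_total > 3000

# argmax returns 0 when no element is True, which looks like a real index
print(np.argmax(mask))  # 0

# Guard with any() before trusting the index
day = int(np.argmax(mask)) if mask.any() else None
print(day)  # None
```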

Key Takeaways

Vectorization Power

Operations like arr * 2 apply to all elements without explicit loops - this is what makes NumPy fast

Element-wise Math

Standard operators +, -, *, /, ** work element-by-element on arrays

Broadcasting Rules

Arrays of different shapes can work together. Dimensions must be equal or one must be 1

Universal Functions

ufuncs are optimized C functions: np.sqrt(), np.exp(), np.sin(), and many more

Aggregations

Reduce arrays with sum(), mean(), std(), min(), max()

Axis Parameter

axis=0 collapses the rows (one result per column); axis=1 collapses the columns (one result per row)

Knowledge Check

Test your understanding of array operations:

  1. What does np.array([1,2,3]) * np.array([4,5,6]) return?
  2. Can a (3, 4) array broadcast with a (4,) array?
  3. What type of function is np.sqrt()?
  4. For a (3, 4) array, what shape does np.sum(arr, axis=0) return?
  5. How do you count how many elements in arr are greater than 5?
  6. What does np.cumsum([1, 2, 3, 4]) return?