Element-wise Operations
Element-wise operations are NumPy's superpower. Instead of writing loops to process each element, you can apply operations to entire arrays at once. This is called vectorization, and it is a large part of why NumPy is fast: the per-element loop runs in compiled code instead of the Python interpreter.
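To make the contrast concrete, here is a small sketch that computes the same result with an explicit Python loop and with a single vectorized expression. The array name `values` and the doubling operation are illustrative choices, not part of the lesson's data.

```python
import numpy as np

values = np.array([1, 2, 3, 4, 5])

# Loop version: Python visits each element one at a time
doubled_loop = np.empty_like(values)
for i in range(len(values)):
    doubled_loop[i] = values[i] * 2

# Vectorized version: one expression, the loop runs inside NumPy
doubled_vec = values * 2

print(doubled_loop)  # [ 2  4  6  8 10]
print(doubled_vec)   # [ 2  4  6  8 10]
```

Both versions give the same array; the vectorized one is shorter and avoids per-element interpreter overhead.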
Arithmetic Operations
Standard arithmetic operators (+, -, *, /, **) work element-by-element on arrays:
import numpy as np
a = np.array([1, 2, 3, 4])
b = np.array([10, 20, 30, 40])
Addition and subtraction work on corresponding elements:
print(a + b) # [11 22 33 44]
print(b - a) # [ 9 18 27 36]
Multiplication and division are also element-wise (not matrix operations):
print(a * b) # [ 10 40 90 160]
print(b / a) # [10. 10. 10. 10.]
Power operator raises each element to the power:
print(a ** 2) # [ 1 4 9 16]
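One detail worth checking for yourself is the result type of division. The sketch below, using a made-up array, shows that `/` produces floats even for integer input, while `//` and `%` keep integers.

```python
import numpy as np

a = np.array([7, 8, 9])

true_div = a / 2    # true division: result is float even for integer input
floor_div = a // 2  # floor division: stays integer
remainder = a % 2   # element-wise remainder

print(true_div)   # [3.5 4.  4.5]
print(floor_div)  # [3 4 4]
print(remainder)  # [1 0 1]
```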
Operations with Scalars
When you combine an array with a single value (scalar), the operation is applied to every element:
arr = np.array([1, 2, 3, 4, 5])
print(arr + 10) # [11 12 13 14 15]
print(arr * 2) # [ 2 4 6 8 10]
A common pattern is normalizing data to a 0-1 range:
# Normalize data to 0-1 range
data = np.array([100, 200, 300, 400, 500])
normalized = (data - data.min()) / (data.max() - data.min())
print(normalized) # [0. 0.25 0.5 0.75 1. ]
Comparison Operations
Comparisons also work element-wise, returning boolean arrays:
a = np.array([1, 2, 3, 4, 5])
b = np.array([5, 4, 3, 2, 1])
You can compare arrays to each other or to a scalar:
print(a > b) # [False False False True True]
print(a == b) # [False False True False False]
print(a >= 3) # [False False True True True]
Boolean arrays can be used for filtering:
# Use boolean array for indexing
print(a[a > 2]) # [3 4 5]
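Boolean arrays can also be combined before filtering. The following sketch reuses the array `a` from above; the variable names `middle`, `edges`, and `not_big` are only illustrative.

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5])

# & is element-wise AND, | is OR, ~ is NOT; parentheses are required
# because these operators bind more tightly than comparisons
middle = a[(a > 1) & (a < 5)]
edges = a[(a == 1) | (a == 5)]
not_big = a[~(a > 2)]

print(middle)   # [2 3 4]
print(edges)    # [1 5]
print(not_big)  # [1 2]
```

Python's `and` / `or` keywords do not work here because they expect a single truth value, not an array.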
Operations on 2D Arrays
Element-wise operations work the same way for multi-dimensional arrays:
A = np.array([[1, 2],
[3, 4]])
B = np.array([[10, 20],
[30, 40]])
Element-wise addition adds corresponding elements:
print(A + B)
# [[11 22]
# [33 44]]
Element-wise multiplication multiplies corresponding elements (this is NOT matrix multiplication!):
print(A * B)
# [[ 10 40]
# [ 90 160]]
A * B is element-wise multiplication. For matrix multiplication,
use A @ B or np.dot(A, B). We'll cover this in Topic 3.3.
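As a quick side-by-side, this sketch runs both kinds of multiplication on the same two matrices so the difference in results is visible.

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
B = np.array([[10, 20],
              [30, 40]])

elementwise = A * B  # multiplies matching positions
matrix_prod = A @ B  # rows of A dotted with columns of B

print(elementwise)
# [[ 10  40]
#  [ 90 160]]
print(matrix_prod)
# [[ 70 100]
#  [150 220]]
```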
Practice Questions: Element-wise Operations
Test your understanding of array operations.
Given:
prices = np.array([10, 25, 50, 100])
Task: Double all prices.
Expected output: [ 20 50 100 200]
Solution:
prices = np.array([10, 25, 50, 100])
doubled = prices * 2
print(doubled) # [ 20 50 100 200]
Given:
old_prices = np.array([100, 200, 150])
new_prices = np.array([120, 180, 165])
Task: Calculate percentage change for each item.
Formula: ((new - old) / old) * 100
Solution:
pct_change = ((new_prices - old_prices) / old_prices) * 100
print(pct_change) # [ 20. -10. 10.]
Given:
data = np.array([20, 50, 80, 30, 100])
Task: Normalize values to range [0, 1].
Formula: (x - min) / (max - min)
Solution:
data = np.array([20, 50, 80, 30, 100])
normalized = (data - data.min()) / (data.max() - data.min())
print(normalized) # [0. 0.375 0.75 0.125 1. ]
Given:
scores = np.array([45, 72, 88, 55, 91, 67, 83])
Task: Find all scores between 60 and 85 (inclusive).
Hint: Use & with parentheses around each condition
Solution:
scores = np.array([45, 72, 88, 55, 91, 67, 83])
in_range = scores[(scores >= 60) & (scores <= 85)]
print(in_range) # [72 67 83]
Given:
point1 = np.array([1, 2])
point2 = np.array([4, 6])
Task: Calculate Euclidean distance between points.
Formula: sqrt((x2-x1)² + (y2-y1)²)
Solution:
point1 = np.array([1, 2])
point2 = np.array([4, 6])
diff = point2 - point1
distance = np.sqrt(np.sum(diff ** 2))
print(distance) # 5.0
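An alternative you may come across is `np.linalg.norm`, which computes the same Euclidean length in one call. This is offered as a cross-check of the solution above, not as the expected answer.

```python
import numpy as np

point1 = np.array([1, 2])
point2 = np.array([4, 6])

# The norm of the difference vector is the Euclidean distance
distance = np.linalg.norm(point2 - point1)
print(distance)  # 5.0
```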
Broadcasting
Broadcasting is NumPy's powerful mechanism for working with arrays of different shapes. It allows operations between arrays that don't have the same dimensions by automatically "stretching" the smaller array.
What is Broadcasting?
When you add a scalar to an array, NumPy "broadcasts" the scalar across all elements. But broadcasting works with arrays of different shapes too:
Scalar Broadcasting: The scalar 10 is virtually expanded to [10, 10, 10] to match the array shape. No extra memory is used.
import numpy as np
# Simple broadcasting: scalar to array
arr = np.array([1, 2, 3])
print(arr + 10) # [11 12 13]
# The scalar 10 is "broadcast" to [10, 10, 10]
# Broadcasting: 1D array with 2D array
matrix = np.array([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
row = np.array([10, 20, 30])
print(matrix + row)
# [[11 22 33]
# [14 25 36]
# [17 28 39]]
# The row is broadcast across all rows of the matrix
Without Broadcasting
You'd need to manually replicate the row 3 times, then add.
With Broadcasting
NumPy automatically handles the replication - no extra memory!
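If you want to see the "no extra memory" claim for yourself, one way is `np.broadcast_to`, which returns the stretched view explicitly. The sketch below inspects that view; the name `stretched` is illustrative.

```python
import numpy as np

row = np.array([10, 20, 30])

# np.broadcast_to returns a read-only view shaped like the target
stretched = np.broadcast_to(row, (3, 3))

print(stretched.shape)                   # (3, 3)
print(stretched.strides[0])              # 0: stepping to the next row moves 0 bytes
print(np.shares_memory(stretched, row))  # True: same underlying buffer
```

A stride of 0 along the first axis means every "row" of the view points at the same three numbers.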
Broadcasting Rules
NumPy compares shapes from right to left. Two dimensions are compatible when:
- They are equal, or
- One of them is 1
| Array A | Array B | Result | Why? |
|---|---|---|---|
| (3, 4) | (4,) | (3, 4) | ✓ (4) == (4), (3) broadcasts with missing dim |
| (3, 4) | (3, 1) | (3, 4) | ✓ (4) broadcasts with (1), (3) == (3) |
| (3, 4) | (3,) | Error | ✗ (4) != (3) and neither is 1 |
| (2, 3, 4) | (3, 4) | (2, 3, 4) | ✓ Last two dims match |
| (2, 3, 4) | (1,) | (2, 3, 4) | ✓ (1) broadcasts with everything |
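The right-to-left comparison can be written out in a few lines of plain Python. The function `broadcast_shape` below is a hypothetical helper written for illustration, not a NumPy API; it is a minimal sketch of the two rules, assuming shapes are given as tuples.

```python
from itertools import zip_longest

def broadcast_shape(shape_a, shape_b):
    """Return the broadcast result shape, or None if the shapes are incompatible."""
    result = []
    # Compare dimensions from right to left; a missing dimension counts as 1
    pairs = zip_longest(reversed(shape_a), reversed(shape_b), fillvalue=1)
    for dim_a, dim_b in pairs:
        if dim_a == dim_b or dim_b == 1:
            result.append(dim_a)
        elif dim_a == 1:
            result.append(dim_b)
        else:
            return None  # neither equal nor 1: NumPy would raise an error
    return tuple(reversed(result))

print(broadcast_shape((3, 4), (4,)))     # (3, 4)
print(broadcast_shape((3, 4), (3, 1)))   # (3, 4)
print(broadcast_shape((3, 4), (3,)))     # None
print(broadcast_shape((2, 3, 4), (1,)))  # (2, 3, 4)
```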
Practical Broadcasting Examples
Example 1: Subtract the column mean from each column. The
column means have shape (3,) which broadcasts with (3, 3):
data = np.array([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
col_means = data.mean(axis=0) # [4. 5. 6.] - shape (3,)
centered = data - col_means
print(centered)
# [[-3. -3. -3.]
# [ 0. 0. 0.]
# [ 3. 3. 3.]]
Example 2: Subtract the row mean from each row. Use
keepdims=True to get shape (3, 1) instead of (3,):
row_means = data.mean(axis=1, keepdims=True) # [[2.], [5.], [8.]]
centered = data - row_means
print(centered)
# [[-1. 0. 1.]
# [-1. 0. 1.]
# [-1. 0. 1.]]
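It is worth seeing what happens if keepdims=True is left out. With a square matrix the shapes still broadcast, so NumPy raises no error, but the means line up with the wrong axis. The sketch below compares both versions; `wrong` and `right` are illustrative names.

```python
import numpy as np

data = np.array([[1, 2, 3],
                 [4, 5, 6],
                 [7, 8, 9]])

row_means_flat = data.mean(axis=1)                # shape (3,)
row_means_col = data.mean(axis=1, keepdims=True)  # shape (3, 1)

wrong = data - row_means_flat  # (3, 3) - (3,): means line up with columns
right = data - row_means_col   # (3, 3) - (3, 1): means line up with rows

print(wrong[0])  # [-1. -3. -5.]
print(right[0])  # [-1.  0.  1.]
```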
Example 3: Compute an outer product using broadcasting. Adding
a dimension with np.newaxis changes shape from (3,) to
(3, 1):
a = np.array([1, 2, 3])[:, np.newaxis] # Shape (3, 1)
b = np.array([10, 20, 30]) # Shape (3,)
print(a * b) # (3, 1) * (3,) = (3, 3)
# [[ 10 20 30]
# [ 20 40 60]
# [ 30 60 90]]
Use keepdims=True in aggregations to preserve dimensions for broadcasting. Use np.newaxis or reshape() to add dimensions.
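The two ways of adding a dimension give the same column shape. This short sketch checks that, and also compares the broadcast product against `np.outer` as a sanity check; the names `v` and `w` are illustrative.

```python
import numpy as np

v = np.array([1, 2, 3])
w = np.array([10, 20, 30])

col_newaxis = v[:, np.newaxis]  # shape (3, 1)
col_reshape = v.reshape(-1, 1)  # shape (3, 1); -1 lets NumPy infer the length

print(col_newaxis.shape, col_reshape.shape)             # (3, 1) (3, 1)
print(np.array_equal(col_newaxis * w, np.outer(v, w)))  # True
```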
Practice Questions: Broadcasting
Test your understanding of broadcasting rules.
Task: Determine if these shapes are broadcast-compatible and what the result shape would be:
- (5, 3) and (3,)
- (4, 1) and (3,)
- (2, 3) and (3, 2)
Solution:
(5, 3) + (3,) → ✓ Result: (5, 3)
(4, 1) + (3,) → ✓ Result: (4, 3)
(2, 3) + (3, 2) → ✗ Error! The trailing dimensions (3 and 2) differ and neither is 1
Task: Normalize each row of this matrix by dividing by the row sum:
data = np.array([[1, 2, 3],
[4, 5, 6]])
Solution:
row_sums = data.sum(axis=1, keepdims=True)
normalized = data / row_sums
print(normalized)
# [[0.167 0.333 0.5 ]
# [0.267 0.333 0.4 ]]
Universal Functions (ufuncs)
Universal functions (ufuncs) are NumPy's optimized functions that operate on arrays element-by-element. They're implemented in compiled C code, making them incredibly fast.
What are Universal Functions?
When you use np.sqrt(arr) or np.sin(arr), you're using ufuncs.
They automatically apply to every element without explicit loops:
import numpy as np
arr = np.array([1, 4, 9, 16, 25])
# ufunc example: square root
print(np.sqrt(arr)) # [1. 2. 3. 4. 5.]
# Much faster than Python loops
# Slow way (don't do this):
# result = [math.sqrt(x) for x in arr]
# Fast way (ufunc):
result = np.sqrt(arr)
Mathematical ufuncs
| Category | Functions | Description |
|---|---|---|
| Basic Math | add, subtract, multiply, divide, power | Arithmetic operations |
| Rounding | floor, ceil, round, trunc | Round down, up, to nearest, or toward zero |
| Exponential | exp, exp2, log, log2, log10 | Exponentials and logarithms |
| Trigonometric | sin, cos, tan, arcsin, arccos, arctan | Trig functions (radians) |
| Absolute | abs, absolute, fabs | Absolute value |
| Signs | sign, negative, positive | Sign operations |
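The "Basic Math" row is closely tied to the arithmetic operators from earlier: on arrays, an operator and its named ufunc give the same result. A small sketch to confirm this, using made-up arrays:

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([10, 20, 30])

print(isinstance(np.add, np.ufunc))            # True
print(np.array_equal(a + b, np.add(a, b)))     # True
print(np.array_equal(a ** 2, np.power(a, 2)))  # True
```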
# Examples of common ufuncs
arr = np.array([-2.5, -1.2, 0, 1.7, 3.9])
print(np.abs(arr)) # [2.5 1.2 0. 1.7 3.9]
print(np.floor(arr)) # [-3. -2. 0. 1. 3.]
print(np.ceil(arr)) # [-2. -1. 0. 2. 4.]
print(np.round(arr)) # [-2. -1. 0. 2. 4.]
print(np.sign(arr)) # [-1. -1. 0. 1. 1.]
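Notice that np.round turned -2.5 into -2 above. NumPy rounds exact halves to the nearest even number, which differs from the "always round half up" rule many people learn in school. A short sketch with illustrative values:

```python
import numpy as np

halves = np.array([0.5, 1.5, 2.5, 3.5])

print(np.round(halves))  # [0. 2. 2. 4.]  halves go to the nearest even number
print(np.trunc(halves))  # [0. 1. 2. 3.]  trunc simply drops the fraction
```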
Exponentials and logarithms are essential for data science calculations like normalizing values or working with probability distributions:
arr = np.array([1, 2, 3])
print(np.exp(arr)) # [ 2.718 7.389 20.086]
print(np.log(arr)) # [0. 0.693 1.099] (natural log)
print(np.log10(arr)) # [0. 0.301 0.477] (base 10)
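As one example of the probability use case, the sketch below turns raw scores into probabilities with a softmax. The function `softmax` is defined here for illustration (it is not a NumPy function), and subtracting the maximum first is a common stability trick assumed for this sketch.

```python
import numpy as np

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    shifted = scores - np.max(scores)  # subtracting the max avoids overflow in exp
    exps = np.exp(shifted)
    return exps / np.sum(exps)

probs = softmax(np.array([1.0, 2.0, 3.0]))
print(probs.sum())       # sums to 1 (up to floating-point rounding)
print(np.argmax(probs))  # 2: the largest score gets the largest probability
```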
Trigonometric functions work with angles in radians. These are used frequently in signal processing and feature engineering:
angles = np.array([0, np.pi/4, np.pi/2, np.pi])
print(np.sin(angles)) # approx. [0. 0.707 1. 0.] (sin(pi) actually prints as ~1.2e-16)
print(np.cos(angles)) # approx. [1. 0.707 0. -1.] (cos(pi/2) actually prints as ~6.1e-17)
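If your angles are in degrees, convert them first. The sketch below uses `np.deg2rad` and rounds the result for readability; the input angles are made up for illustration.

```python
import numpy as np

degrees = np.array([0, 30, 90, 180])
radians = np.deg2rad(degrees)  # same as degrees * np.pi / 180

sines = np.round(np.sin(radians), 3)  # round away tiny floating-point residue
print(sines)
```

The rounded values are 0, 0.5, 1, and 0.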
Binary ufuncs (Two Inputs)
Binary ufuncs take two arrays as input and compute element-wise results:
a = np.array([1, 5, 10])
b = np.array([2, 3, 4])
np.maximum() and np.minimum() compare element-wise:
print(np.maximum(a, b)) # [ 2 5 10]
print(np.minimum(a, b)) # [1 3 4]
Other useful binary ufuncs include power, modulo, and comparisons:
print(np.power(a, b)) # [ 1 125 10000]
print(np.mod(a, b)) # [1 2 2]
print(np.greater(a, b)) # [False True True]
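Binary ufuncs also accept a scalar as the second input, which is handy for capping values. A small sketch with an illustrative array:

```python
import numpy as np

values = np.array([-3, -1, 0, 2, 5])

floored = np.maximum(values, 0)   # the scalar 0 is broadcast against every element
clipped = np.clip(values, -2, 2)  # same as np.minimum(np.maximum(values, -2), 2)

print(floored)  # [0 0 0 2 5]
print(clipped)  # [-2 -1  0  2  2]
```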
The out Parameter
For memory efficiency, you can specify an output array instead of creating a new one.
Without out, NumPy creates a new array:
arr = np.array([1, 2, 3, 4])
result = np.square(arr)
print(result) # [ 1 4 9 16]
With out, NumPy writes to an existing array (no new allocation):
output = np.empty(4)
np.square(arr, out=output)
print(output) # [ 1. 4. 9. 16.]
You can even do in-place operations by setting the output to the same array:
np.square(arr, out=arr) # Modify arr directly
print(arr) # [ 1 4 9 16]
Using out= avoids allocating a new array, which can speed up tight loops. This is especially useful in iterative algorithms.
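One thing to watch for: the output array must have a dtype that can hold the result. The sketch below shows NumPy refusing to write square roots (floats) into an integer array, and then the same call succeeding with a float buffer. The name `buffer` is illustrative.

```python
import numpy as np

arr = np.array([1, 4, 9, 16])  # integer dtype

try:
    np.sqrt(arr, out=arr)  # float result cannot be cast into an int array
except TypeError as exc:
    print("Refused:", type(exc).__name__)

# A float buffer of the same shape works
buffer = np.empty(arr.shape, dtype=np.float64)
np.sqrt(arr, out=buffer)
print(buffer)  # [1. 2. 3. 4.]
```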
Practice Questions: Universal Functions
Apply ufuncs to solve these exercises.
Task: Create an array [1, 4, 9, 16, 25] and compute the square root of each element.
Solution:
arr = np.array([1, 4, 9, 16, 25])
print(np.sqrt(arr)) # [1. 2. 3. 4. 5.]
Given:
a = np.array([3, 7, 2, 8])
b = np.array([5, 1, 9, 4])
Task: Find the element-wise maximum of the two arrays.
Solution:
result = np.maximum(a, b)
print(result) # [5 7 9 8]
Aggregations
Aggregation functions reduce arrays to single values (or smaller arrays). They're essential for computing statistics: sum, mean, max, min, standard deviation, and more.
Basic Aggregation Functions
import numpy as np
arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
Calculate sum and product of all elements:
print(np.sum(arr)) # 55
print(np.prod(arr)) # 3628800
Find minimum and maximum values:
print(np.min(arr)) # 1
print(np.max(arr)) # 10
Find the index of min and max values:
print(np.argmin(arr)) # 0 (index of minimum)
print(np.argmax(arr)) # 9 (index of maximum)
Calculate statistics like mean, median, and standard deviation:
print(np.mean(arr)) # 5.5
print(np.median(arr)) # 5.5
print(np.std(arr)) # 2.872 (standard deviation)
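By default np.std computes the population standard deviation (dividing by n). If you need the sample standard deviation (dividing by n - 1), pass ddof=1. A short sketch comparing the two on the same array:

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

population_std = np.std(arr)      # default ddof=0: divide by n
sample_std = np.std(arr, ddof=1)  # ddof=1: divide by n - 1

print(round(population_std, 3))  # 2.872
print(round(sample_std, 3))      # 3.028
```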
| Function | Description | Method Syntax |
|---|---|---|
| np.sum() | Sum of elements | arr.sum() |
| np.prod() | Product of elements | arr.prod() |
| np.mean() | Arithmetic mean | arr.mean() |
| np.std() | Standard deviation | arr.std() |
| np.var() | Variance | arr.var() |
| np.min() | Minimum value | arr.min() |
| np.max() | Maximum value | arr.max() |
| np.argmin() | Index of minimum | arr.argmin() |
| np.argmax() | Index of maximum | arr.argmax() |
| np.cumsum() | Cumulative sum | arr.cumsum() |
| np.cumprod() | Cumulative product | arr.cumprod() |
Aggregating Along Axes
For multi-dimensional arrays, use the axis parameter to aggregate
along specific dimensions:
arr = np.array([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
Without specifying axis, the function aggregates all elements:
# No axis: aggregate all elements
print(np.sum(arr)) # 45 (sum of all)
With axis=0, aggregate down columns (collapse rows):
# axis=0: sum each column
print(np.sum(arr, axis=0)) # [12 15 18]
With axis=1, aggregate across rows (collapse columns):
# axis=1: sum each row
print(np.sum(arr, axis=1)) # [ 6 15 24]
axis=0
Collapse rows → result has one value per column. "Sum down each column"
axis=1
Collapse columns → result has one value per row. "Sum across each row"
Use keepdims=True to preserve dimensions for broadcasting:
# More examples with axis
print(np.mean(arr, axis=0)) # [4. 5. 6.] (column means)
print(np.mean(arr, axis=1)) # [2. 5. 8.] (row means)
print(np.max(arr, axis=0)) # [7 8 9] (max in each column)
print(np.max(arr, axis=1)) # [3 6 9] (max in each row)
# Keepdims preserves dimensions (useful for broadcasting)
col_means = np.mean(arr, axis=0, keepdims=True)
print(col_means.shape) # (1, 3) instead of (3,)
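Because the example array is square, it can be hard to tell which axis was collapsed. A non-square array makes the effect on the shape easier to see. The array `grid` below is made up for this sketch.

```python
import numpy as np

grid = np.arange(6).reshape(2, 3)  # [[0 1 2], [3 4 5]]

print(grid.sum(axis=0).shape)                 # (3,)  rows collapsed
print(grid.sum(axis=1).shape)                 # (2,)  columns collapsed
print(grid.sum(axis=1, keepdims=True).shape)  # (2, 1)
print(grid.sum(axis=0))                       # [3 5 7]
print(grid.sum(axis=1))                       # [ 3 12]
```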
Boolean Aggregations
arr = np.array([1, 2, 3, 4, 5])
Check if any element satisfies a condition:
print(np.any(arr > 3)) # True (4 and 5 are > 3)
Check if all elements satisfy a condition:
print(np.all(arr > 0)) # True (all positive)
print(np.all(arr > 3)) # False (1, 2, 3 are not)
Count how many elements satisfy a condition:
# Sum of True values = count
print(np.sum(arr > 2)) # 3 (three values > 2)
# Mean of True values = percentage
print(np.mean(arr > 2)) # 0.6 (60% are > 2)
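Two related tools are worth knowing: `np.count_nonzero` counts True values directly, and `np.any` / `np.all` accept the axis parameter like other aggregations. The 2D array `table` below is invented for this sketch.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])
print(np.count_nonzero(arr > 2))  # 3, same as np.sum(arr > 2)

table = np.array([[1, -2, 3],
                  [4, 5, 6]])
print(np.any(table < 0, axis=1))  # [ True False]  does each row contain a negative?
print(np.all(table > 0, axis=0))  # [ True False  True]  is each column all positive?
```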
Cumulative Operations
Cumulative operations compute running totals:
arr = np.array([1, 2, 3, 4, 5])
print(np.cumsum(arr)) # [ 1 3 6 10 15]
print(np.cumprod(arr)) # [ 1 2 6 24 120]
Practical example - tracking running sum of sales:
daily_sales = np.array([100, 150, 200, 180, 220])
total_to_date = np.cumsum(daily_sales)
print(total_to_date) # [100 250 450 630 850]
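The running total can be reversed with `np.diff`, which takes differences between neighbors. The sketch below recovers the daily figures from the cumulative ones; `prepend=0` supplies the starting total so the first day is not lost.

```python
import numpy as np

daily_sales = np.array([100, 150, 200, 180, 220])
total_to_date = np.cumsum(daily_sales)

# Differences between consecutive totals give back the daily values
recovered = np.diff(total_to_date, prepend=0)
print(recovered)  # [100 150 200 180 220]
```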
Method syntax (arr.sum()) and function syntax (np.sum(arr)) do the same thing. Use whichever you prefer!
Practice Questions: Aggregations
Master statistical operations with these exercises.
Given:
scores = np.array([85, 92, 78, 95, 88, 76, 90])
Task: Find the mean, median, and standard deviation.
Solution:
scores = np.array([85, 92, 78, 95, 88, 76, 90])
print(f"Mean: {np.mean(scores):.2f}") # 86.29
print(f"Median: {np.median(scores)}") # 88.0
print(f"Std: {np.std(scores):.2f}") # 6.56
Given:
temps = np.array([23, 28, 31, 25, 29, 27])
Task: Find the index of the hottest day.
Solution:
temps = np.array([23, 28, 31, 25, 29, 27])
hottest_day = np.argmax(temps)
print(f"Hottest day index: {hottest_day}") # 2
print(f"Temperature: {temps[hottest_day]}") # 31
Given:
sales = np.array([[100, 200, 150],
[120, 180, 160],
[110, 220, 140]])
Task: Find the total sales for each product (columns).
Expected output: [330 600 450]
Solution:
product_totals = np.sum(sales, axis=0)
print(product_totals) # [330 600 450]
Given:
grades = np.array([72, 85, 90, 65, 88, 78, 92, 58, 81])
Task: What percentage of students scored above 80?
Solution:
grades = np.array([72, 85, 90, 65, 88, 78, 92, 58, 81])
pct_above_80 = np.mean(grades > 80) * 100
print(f"{pct_above_80:.1f}%") # 55.6%
Given:
daily_revenue = np.array([500, 800, 650, 920, 750, 1100, 890])
Task: Calculate the running total and find on which day (index) it first exceeds 3000.
Solution:
daily_revenue = np.array([500, 800, 650, 920, 750, 1100, 890])
running_total = np.cumsum(daily_revenue)
print(running_total) # [ 500 1300 1950 2870 3620 4720 5610]
day = np.argmax(running_total > 3000)
print(f"Day {day}: {running_total[day]}") # Day 4: 3620
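A caveat with this argmax trick: if no element satisfies the condition, np.argmax on the all-False array still returns 0, which looks like a valid answer. The sketch below uses a shorter, made-up revenue history that never reaches 3000 and guards the lookup with .any().

```python
import numpy as np

running_total = np.array([500, 1300, 1950])
mask = running_total > 3000

print(np.argmax(mask))  # 0, even though no entry exceeds 3000

# Only trust the index when at least one entry is True
first_day = int(np.argmax(mask)) if mask.any() else None
print(first_day)  # None
```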
Key Takeaways
Vectorization Power
Operations like arr * 2 apply to all elements without explicit loops - this is what makes NumPy fast
Element-wise Math
Standard operators +, -, *, /, ** work element-by-element on arrays
Broadcasting Rules
Arrays of different shapes can work together. Dimensions must be equal or one must be 1
Universal Functions
ufuncs are optimized C functions: np.sqrt(), np.exp(), np.sin(), and many more
Aggregations
Reduce arrays with sum(), mean(), std(), min(), max()
Axis Parameter
axis=0 collapses rows (one result per column), axis=1 collapses columns (one result per row)