Python Basics

Why Python for Data Science?

Python has become the dominant language of data science. But why? Let's explore the history, philosophy, and reasons behind that dominance.

What is Programming?

Before we dive into Python, let's understand what programming actually means. Programming is the process of giving instructions to a computer to perform specific tasks. Think of it like writing a recipe: you provide step-by-step instructions, and the computer follows them exactly.

Computers don't understand human language. They only understand binary, which is sequences of 1s and 0s. Programming languages like Python act as a bridge between human thinking and machine code. When you write Python code, it gets translated into instructions the computer can execute.

1. Human writes code - print("Hello!")

2. Python interprets - translates the code into instructions the computer can run

3. Computer executes - displays "Hello!" on screen

Your First Python Program: Hello, World!

Every programmer's journey begins with a simple tradition: writing a "Hello, World!" program. This tradition dates back to the 1970s and serves as a quick way to verify that your programming environment is set up correctly.

In Python, this is remarkably simple. Just one line of code:

# Your very first Python program!
# The print() function displays text on the screen

print("Hello, World!")

Output:

Hello, World!

Let's break down what's happening here:

# - Lines starting with a hash symbol are comments. Python ignores these lines. Comments are notes for humans reading the code.
print() - This is a function built into Python. Its job is to display output on the screen.
"Hello, World!" - This is a string (text). The quotes tell Python this is text, not a command.

Try It Yourself: If you've installed Python (see Module 1.2), open your terminal or command prompt, type python (or python3 on some systems) to start the Python interpreter, then type the print statement above. You've just written your first program!

More Examples with print()

# Printing different messages
print("Welcome to Python programming!")
print("My name is [Your Name]")
print("I am learning data science!")

# Printing numbers (no quotes needed for numbers)
print(42)
print(3.14159)

# Printing multiple items (separated by comma)
print("The answer is:", 42)

# Using + to join strings (called concatenation)
print("Hello, " + "World!")

# Empty print() creates a blank line
print()
print("This appears after a blank line")

Output:

Welcome to Python programming!
My name is [Your Name]
I am learning data science!
42
3.14159
The answer is: 42
Hello, World!

This appears after a blank line
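
The examples above rely on print()'s defaults: items are separated by a space and each call ends with a newline. As a small sketch (the date and message values are made up for illustration), the built-in sep and end parameters change both:

```python
# sep controls what is printed between items (default: a single space)
print("2025", "01", "31", sep="-")   # 2025-01-31

# end controls what is printed after the last item (default: a newline)
print("Loading", end="...")
print("done")                        # Loading...done

# str.join builds the same text as a value you can store and reuse
date_text = "-".join(["2025", "01", "31"])
print(date_text)                     # 2025-01-31
```

Use sep and end when you only need to display something; use join when you need the combined text as a value.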

A Brief History of Python

Python was created by Guido van Rossum, a Dutch programmer, and first released in 1991. At the time, Guido was working at Centrum Wiskunde & Informatica (CWI) in the Netherlands. He wanted to create a language that was easy to read and enjoyable to use.

Fun fact: The name "Python" doesn't come from the snake. It's named after Monty Python's Flying Circus, the BBC comedy series by the British comedy group Monty Python, which Guido enjoyed while developing the language. This playful origin reflects Python's philosophy of making programming fun.

Guido designed Python with a clear philosophy: code should be readable. He believed that code is read more often than it is written, so it should be easy to understand at a glance. This philosophy is captured in "The Zen of Python", a collection of 19 guiding principles that every Python programmer should know.

Timeline

1989: Development began
1991: First public release (Python 0.9.0)
1994: Python 1.0 released
2000: Python 2.0 released
2008: Python 3.0 released
2020: Python 2 end of life
Today: Python 3.11+ is standard

Python has evolved significantly over the years. Python 2 was released in 2000 and became hugely popular. Python 3, released in 2008, introduced major improvements but wasn't backward compatible with Python 2. This caused a slow transition period, but Python 2 was officially retired in 2020. Today, virtually all new projects use Python 3, and you should too!

The Zen of Python: Type import this in Python to see the 19 guiding principles. Key ones include: "Beautiful is better than ugly", "Simple is better than complex", and "Readability counts."

# Try this in Python!
import this

# Output shows "The Zen of Python" by Tim Peters
# Beautiful is better than ugly.
# Explicit is better than implicit.
# Simple is better than complex.
# Complex is better than complicated.
# Readability counts.
# ...

Why Python Dominates Data Science

In Stack Overflow's Developer Survey and GitHub's Octoverse, Python consistently ranks among the most widely used languages, and it is especially prominent in data science and machine learning. Here's why:

Easy to Learn

Python's syntax reads like English. No curly braces around code blocks, no required semicolons, just clean, indented code. Beginners can start building useful programs in days, not months.

Rich Ecosystem

NumPy, pandas, matplotlib, scikit-learn, TensorFlow, PyTorch: Python has world-class libraries for every data science task, all free and open source.

Huge Community

Millions of developers, thousands of tutorials, and countless Stack Overflow answers. Whatever problem you face, someone has solved it before.

Industry Adoption

Netflix, Google, Facebook, Instagram, Spotify, NASA: all use Python for data science. Learning Python opens doors to top companies worldwide.

Integration Power

Python integrates with databases (SQL, MongoDB), web frameworks (Flask, Django), cloud services (AWS, Azure), and even other languages (C, C++, Java).

Rapid Prototyping

Python lets you go from idea to working prototype quickly. Data scientists can experiment, iterate, and validate hypotheses faster than in most other languages.

Python vs Other Languages

How does Python compare to other languages used in data science?

Aspect	Python	R	Julia	SQL
Learning Curve	Easy	Moderate	Moderate	Easy
Primary Use	General purpose + DS	Statistical analysis	High-performance computing	Database queries
ML Libraries	Excellent	Good	Growing	Limited
Visualization	Excellent	Excellent	Good	None
Speed	Moderate	Slow	Very Fast	Fast (DB)
Job Market	Highest	Good	Niche	Essential
Best For	End-to-end DS workflow	Academic research	Scientific computing	Data extraction

Recommendation: Start with Python as your primary language. Learn SQL for database queries. Consider R for specialized statistics or Julia if you need extreme performance later in your career.

Essential Data Science Libraries

Here's a preview of the powerful libraries you'll use throughout this course:

NumPy

Fast numerical arrays and mathematical operations. Foundation for all scientific computing in Python.

pandas

DataFrames for data manipulation. Think Excel on steroids: filter, group, merge with ease.

Matplotlib

Create static, animated, and interactive visualizations. The grandfather of Python plotting.

Seaborn

Beautiful statistical visualizations built on Matplotlib. Perfect for EDA and presentations.

Scikit-learn

Machine learning made simple. Classification, regression, clustering: all in one package.

TensorFlow/PyTorch

Deep learning frameworks for neural networks, computer vision, and NLP.

# Importing the essential data science stack
import numpy as np          # Numerical computing
import pandas as pd         # Data manipulation
import matplotlib.pyplot as plt  # Visualization
import seaborn as sns       # Statistical plots
from sklearn.linear_model import LinearRegression  # Machine learning (import only what you need)

# Now you're ready for data science!

Variables & Data Types

Variables are the building blocks of any program. They store data that your program can use and manipulate. Let's master Python's variable system and core data types.

What is a Variable?

Imagine you're organizing a warehouse. You have boxes, and each box has a label so you know what's inside. In programming, variables work exactly the same way. They're labeled containers that hold data.

Key Concept

Variable

A variable is a named reference to a value stored in your computer's memory. Think of it as a labeled box where you can store and retrieve data whenever you need it.

Unlike mathematics where x = 5 means "x equals 5", in programming it means "store the value 5 in a container called x". You can change what's inside the container at any time - that's why it's called a variable (it can vary).

Why it matters: Variables are the foundation of all programming. Every piece of data your program works with - user names, prices, temperatures, images - gets stored in variables.

Real-World Analogy

Think of your computer's memory as a giant warehouse with millions of storage boxes. Each box has a unique address (like "Box #1234567"). A variable is simply a friendly name you give to one of these boxes so you can easily find it later. Instead of saying "get the value from memory address 0x7fff5fbff8c", you can just say "get age".

Assignment Operator

The = sign in Python is called the assignment operator. It doesn't mean "equals" in the mathematical sense. Instead, it means "store this value in this variable". Think of it as an arrow pointing left: the value on the right goes into the variable on the left.

In Python, creating a variable is simple. You just assign a value using the = operator:

# Creating variables
name = "Priya"           # A string variable
age = 25                 # An integer variable
height = 5.7             # A float variable
is_student = True        # A boolean variable

# Python figures out the type automatically!
print(name)              # Output: Priya
print(age)               # Output: 25

Dynamic Typing: Unlike Java or C++, Python is dynamically typed. You don't need to declare a variable's type. Python infers it from the value you assign.
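
To make dynamic typing concrete, here is a minimal sketch (the variable names are illustrative) showing one name bound to values of different types over time. The type belongs to the value, not to the name:

```python
value = 42
first_type = type(value).__name__     # 'int'

value = "forty-two"                   # same name, now bound to a string
second_type = type(value).__name__    # 'str'

value = 42.0                          # and now to a float
third_type = type(value).__name__     # 'float'

print(first_type, second_type, third_type)  # int str float
```

Rebinding a name to a different type is legal, but it usually makes code harder to follow, so keep one meaning per variable where you can.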

Variable Naming Rules & Conventions

Valid Names

name - lowercase letters
user_name - snake_case (recommended)
userName - camelCase (less common)
_private - starts with underscore
__dunder__ - double underscores (special)
name2 - letters with numbers
MAX_SIZE - UPPERCASE for constants

Invalid Names

2name - can't start with number
my-name - no hyphens allowed
my name - no spaces allowed
class - reserved keyword
for - reserved keyword
import - reserved keyword
@special - no special characters

# Python naming conventions (PEP 8)
user_name = "priya"           # snake_case for variables
MAX_CONNECTIONS = 100         # UPPER_CASE for constants
_internal_value = 42          # Leading underscore = "private"
# __init__, __str__, __name__  # Dunder names are reserved for Python itself; don't invent your own

# Reserved keywords (35 total in Python 3.11)
# False, None, True, and, as, assert, async, await,
# break, class, continue, def, del, elif, else, except,
# finally, for, from, global, if, import, in, is, lambda,
# nonlocal, not, or, pass, raise, return, try, while, with, yield
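
You don't have to memorize these rules. As a rough sketch, the standard-library keyword module and the str.isidentifier() method can check a candidate name for you. The helper is_valid_variable_name below is a hypothetical name introduced only for this example:

```python
import keyword

def is_valid_variable_name(name):
    """Hypothetical helper: True if name can be used as a variable name."""
    # isidentifier() checks the character rules; iskeyword() rejects reserved words
    return name.isidentifier() and not keyword.iskeyword(name)

print(is_valid_variable_name("user_name"))  # True
print(is_valid_variable_name("2name"))      # False (starts with a digit)
print(is_valid_variable_name("my-name"))    # False (hyphen not allowed)
print(is_valid_variable_name("class"))      # False (reserved keyword)
```

Note that "class".isidentifier() is True on its own, which is why the keyword check is needed as well.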

Core Data Types

In the real world, we deal with different types of information: numbers, text, yes/no answers, etc. Programming languages also need to distinguish between these different types of data. In Python, we call these data types.

Why does it matter? Different data types have different behaviors. For example, you can add two numbers together (5 + 3 = 8), but "adding" two pieces of text joins them together ("Hello" + "World" = "HelloWorld"). Understanding data types helps you write correct code.
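
A short sketch of the point above: the same + operator behaves differently depending on the types involved, and mixing incompatible types raises an error instead of guessing what you meant. The values are illustrative only:

```python
print(5 + 3)               # 8 (numeric addition)
print("Hello" + "World")   # HelloWorld (string concatenation)

mixed_error = None
try:
    "Total: " + 5          # str + int is not allowed
except TypeError as error:
    mixed_error = type(error).__name__
print(mixed_error)         # TypeError

# Convert explicitly to tell Python which behavior you want
total_text = "Total: " + str(5)
print(total_text)          # Total: 5
```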

Let's explore each one in detail:

int

Integers (Whole Numbers)

Integers (abbreviated as int) are whole numbers without decimal points. They can be positive, negative, or zero. Examples: your age, the year, or items in a cart.

Basic integer usage:

age = 25
year = 2025
negative = -100
print(type(age))  # <class 'int'>

For large numbers, use underscores as visual separators (Python ignores them):

big_number = 1_000_000_000
print(big_number)  # 1000000000

Python supports different number bases (binary, octal, hexadecimal):

binary = 0b1010        # Binary (base 2) = 10
octal = 0o17           # Octal (base 8) = 15  
hexadecimal = 0xFF     # Hex (base 16) = 255
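
Going the other way is also possible. The sketch below uses the built-ins bin(), oct(), hex(), and int() with a base argument to convert between an integer and its text form in another base. The variable names are chosen for illustration:

```python
# Integer -> text in another base (note the 0b / 0o / 0x prefixes)
binary_text = bin(10)      # '0b1010'
octal_text = oct(15)       # '0o17'
hex_text = hex(255)        # '0xff'
print(binary_text, octal_text, hex_text)

# Text in another base -> integer (second argument is the base)
print(int("1010", 2))      # 10
print(int("ff", 16))       # 255

# f-strings can format without the prefix
print(f"{10:b}")           # 1010
print(f"{255:X}")          # FF
```

Whatever base you write a literal in, the stored value is the same integer; the base only affects how it is written or displayed.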

Unlike many languages, Python handles arbitrarily large integers:

huge = 10 ** 100  # 1 followed by 100 zeros
print(huge)       # Works perfectly!

Tip: Use underscores in large numbers (1_000_000) for readability.

float

Floating-Point Numbers (Decimals)

Floats (short for "floating-point numbers") represent decimal numbers, meaning numbers with a decimal point. The name comes from how computers store these numbers internally: the decimal point can "float" to different positions to represent very large or very small numbers.

Everyday examples include: prices ($19.99), temperatures (98.6°F), or percentages (0.75 for 75%).

Basic float usage:

price = 19.99
temperature = -40.5
pi = 3.14159265359
print(type(price))  # <class 'float'>

For very large or small numbers, use scientific notation:

avogadro = 6.022e23  # 6.022 × 10²³
tiny = 1.6e-19       # 1.6 × 10⁻¹⁹

Precision Warning: Floats have precision issues because computers store them in binary.

result = 0.1 + 0.2
print(result)         # 0.30000000000000004 (not 0.3!)
print(result == 0.3)  # False!

Solution: Use round() or math.isclose() for comparisons:

import math
print(round(result, 1))           # 0.3
print(math.isclose(result, 0.3))  # True
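
When exact decimal arithmetic matters (money is the usual example), one option is the standard-library decimal module. This is a minimal sketch rather than a recommendation for every case, since Decimal is slower than float:

```python
from decimal import Decimal

# Build Decimals from strings so the exact decimal digits are kept
total = Decimal("0.1") + Decimal("0.2")
print(total)                      # 0.3
is_exact = total == Decimal("0.3")
print(is_exact)                   # True

# Building from a float carries the binary rounding error along
from_float_matches = Decimal(0.1) == Decimal("0.1")
print(from_float_matches)         # False
```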

str

Strings (Text)

Strings (abbreviated as str) are sequences of characters. They represent text data. Names, addresses, email content, product descriptions, entire books: anything that's text is stored as a string in Python.

Strings are immutable - once created, they can't be changed, but you can always create new strings from old ones.

Create strings using single, double, or triple quotes:

name = 'Alice'
greeting = "Hello, World!"
multiline = """This spans
multiple lines."""

Escape characters let you include special characters:

tab = "Col1\tCol2"        # \t = tab
newline = "Line1\nLine2"  # \n = newline
path = "C:\\Users\\Data"  # \\ = backslash

Raw strings (with r prefix) treat backslashes as literal characters - useful for Windows file paths:

raw_path = r"C:\Users\Data"
print(raw_path)  # C:\Users\Data

String indexing and slicing work the same way as for lists:

text = "Python"
print(text[0])     # 'P' (first character)
print(text[-1])    # 'n' (last character)
print(text[0:3])   # 'Pyt' (slice)
print(text[::-1])  # 'nohtyP' (reverse)
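
Indexing reads characters, but because strings are immutable it cannot change them. A small sketch of what that means in practice (the replacement letter is arbitrary):

```python
text = "Python"

assignment_failed = False
try:
    text[0] = "J"              # strings do not support item assignment
except TypeError:
    assignment_failed = True
print(assignment_failed)       # True

# Build a new string from pieces of the old one instead
new_text = "J" + text[1:]
print(new_text)                # Jython
print(text)                    # Python (the original is unchanged)
```

String methods such as upper() and replace() follow the same pattern: they return a new string and leave the original alone.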

F-Strings (Formatted String Literals)

F-strings (Python 3.6+) are the modern way to format strings. Add f before the quotes and use {} to embed expressions:

name = "Priya"
age = 25
message = f"My name is {name} and I'm {age} years old."
print(message)  # My name is Priya and I'm 25 years old.

You can put expressions inside the curly braces:

print(f"Next year I'll be {age + 1}")  # Next year I'll be 26

Number formatting with f-strings:

price = 1234.5678
print(f"Price: ${price:.2f}")      # Price: $1234.57 (2 decimals)
print(f"Price: ${price:,.2f}")     # Price: $1,234.57 (with commas)
print(f"Percentage: {0.856:.1%}")  # Percentage: 85.6%

Alignment and padding:

print(f"{'left':<10}|")    # left      | (left align)
print(f"{'right':>10}|")   #      right| (right align)
print(f"{'center':^10}|")  #   center  | (center)
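
Alignment and number formatting can be combined to print simple tables. The sketch below uses made-up item names and prices: names are left-aligned in 10 characters, prices right-aligned in 8 characters with 2 decimals:

```python
# Made-up data for illustration
items = [("Coffee", 3.5), ("Sandwich", 7.25), ("Cookie", 1.0)]

lines = []
for name, price in items:
    # <10 pads the name on the right; >8.2f pads the price on the left
    lines.append(f"{name:<10}{price:>8.2f}")

receipt = "\n".join(lines)
print(receipt)
```

Every line comes out 18 characters wide, so the prices line up in a column.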

bool

Booleans (True/False)

Booleans represent truth values: True or False (note the capital letters).

is_active = True
is_admin = False
print(type(is_active))  # <class 'bool'>

Booleans are typically created from comparisons:

print(5 > 3)           # True
print(10 == 20)        # False
print("a" in "apple")  # True

Truthiness: Python converts other values to boolean when needed. "Truthy" values evaluate to True, "falsy" values evaluate to False:

# Truthy values
print(bool(1))          # True (non-zero number)
print(bool("hello"))    # True (non-empty string)
print(bool([1, 2, 3]))  # True (non-empty list)

# Falsy values
print(bool(0))    # False
print(bool(""))   # False (empty string)
print(bool([]))   # False (empty list)
print(bool(None)) # False

This is useful in conditions:

name = ""
if name:
    print(f"Hello, {name}")
else:
    print("Name is empty!")  # This runs

Common falsy values: False, None, 0, 0.0, "", [], {}, (). Other empty containers (such as set()) are falsy too; almost everything else is truthy!
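
A few truthiness results surprise beginners. This sketch runs bool() over a handful of illustrative values; note that "0" and [0] are truthy because they are non-empty, even though they contain a zero:

```python
values = [0, "0", "", [], [0], None, 0.0, set(), range(0)]

# Map each value's text form to its truthiness
results = {repr(v): bool(v) for v in values}
for text, truth in results.items():
    print(f"{text:>8} -> {truth}")

print(results["'0'"])   # True  (non-empty string)
print(results["[0]"])   # True  (non-empty list)
print(results["0"])     # False (the number zero)
```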

None

NoneType (Absence of Value)

None is Python's null value. It represents the absence of a value or a missing value. It's not the same as 0, False, or an empty string.

# None represents "no value" or "nothing"
result = None
print(result)                 # None
print(type(result))           # <class 'NoneType'>

Common use case: Functions that might not find what they're looking for return None:

# Common use cases for None
def find_user(user_id):
    """Returns user if found, None otherwise"""
    users = {1: "Priya", 2: "Rahul"}
    return users.get(user_id)  # Returns None if not found

user = find_user(999)
if user is None:              # Use 'is' to check for None, not ==
    print("User not found!")

Functions without an explicit return statement return None:

# Functions without return statement return None
def greet(name):
    print(f"Hello, {name}!")
    # No return statement

result = greet("Priya")       # Prints: Hello, Priya!
print(result)                 # None

None is often used as a default parameter value:

# Default parameter values
def connect(host, port=None):
    if port is None:
        port = 8080           # Use default if not provided
    print(f"Connecting to {host}:{port}")
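
One reason None is the conventional default is that default values are created once, when the function is defined. The sketch below contrasts a list default with a None default; add_tag_shared and add_tag are hypothetical names used only for this illustration:

```python
def add_tag_shared(tag, tags=[]):
    """Pitfall: the same default list is reused across calls."""
    tags.append(tag)
    return tags

def add_tag(tag, tags=None):
    """Safer: create a fresh list each time no list is passed."""
    if tags is None:
        tags = []
    tags.append(tag)
    return tags

add_tag_shared("a")
result_shared = add_tag_shared("b")
print(result_shared)        # ['a', 'b'] (the earlier tag leaked in)

add_tag("a")
result_fresh = add_tag("b")
print(result_fresh)         # ['b']
```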

Checking & Converting Types

Python provides built-in functions to check and convert between types.

Checking types with type() and isinstance():

# Checking types with type() and isinstance()
x = 42
print(type(x))                # <class 'int'>
print(type(x) == int)         # True
print(isinstance(x, int))     # True (preferred method)

# isinstance() can check multiple types
value = 3.14
print(isinstance(value, (int, float)))  # True (is int OR float)
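
One difference between the two checks is worth seeing once: isinstance() also accepts subclasses, and in Python bool is a subclass of int. A brief sketch (the flags list is made up):

```python
print(isinstance(True, int))   # True  (bool is a subclass of int)
print(type(True) == int)       # False (the exact type is bool)

# Because True behaves like 1, you can count True values with sum()
flags = [True, False, True, True]
true_count = sum(flags)
print(true_count)              # 3
```

If a function must reject booleans while accepting integers, check for bool explicitly before checking for int.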

Type conversion (casting) - converting between types:

# String to number
age_str = "25"
age_int = int(age_str)        # 25 (integer)
price_float = float("19.99")  # 19.99 (float)

# Number to string
num = 42
num_str = str(num)            # "42" (string)

Boolean conversion to numbers:

# Boolean conversion
print(int(True))              # 1
print(int(False))             # 0
print(float(True))            # 1.0

Warning: Invalid conversions raise errors:

# WARNING: Invalid conversions raise errors
# int("hello")                # ValueError: invalid literal for int() with base 10: 'hello'
# int("3.14")                 # ValueError: invalid literal for int() with base 10: '3.14'
# float("abc")                # ValueError: could not convert string to float: 'abc'

Safe conversion using try/except:

# Safe conversion with try/except
def safe_int(value):
    try:
        return int(value)
    except (ValueError, TypeError):  # TypeError covers inputs such as None
        return None

print(safe_int("42"))         # 42
print(safe_int("hello"))      # None
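
Since int("3.14") fails, text holding a decimal number has to go through float() first. The helper below, to_int, is a hypothetical name and only one possible design: it truncates toward zero and returns a default when the text is not a usable number:

```python
def to_int(text, default=None):
    """Hypothetical helper: convert numeric text to int, truncating decimals."""
    try:
        return int(text)              # handles "42", " 42 ", "-7"
    except ValueError:
        pass
    try:
        return int(float(text))       # handles "3.99" -> 3
    except (ValueError, OverflowError):
        return default                # "hello", "nan", "inf"

print(to_int("42"))       # 42
print(to_int("3.99"))     # 3 (truncated, not rounded)
print(to_int("-3.99"))    # -3
print(to_int("hello"))    # None
```

If you want rounding rather than truncation, use round(float(text)) instead of int(float(text)).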

Quick Reference: Python Data Types

Type	Example	Mutable?	Description
`int`	`42`, `-10`, `0b1010`	No	Whole numbers, arbitrary precision
`float`	`3.14`, `-0.5`, `1e-5`	No	Decimal numbers, 64-bit precision
`str`	`"hello"`, `'world'`, `f"{x}"`	No	Text sequences, Unicode support
`bool`	`True`, `False`	No	Truth values for conditions
`NoneType`	`None`	N/A	Represents absence of value

Practice Questions: Variables & Data Types

Test your understanding with these specific challenges.

Given:

celsius = 37.5

Task: Convert to Fahrenheit using the formula: F = (C × 9/5) + 32

Expected output: 99.5

Solution:

celsius = 37.5
fahrenheit = (celsius * 9/5) + 32
print(fahrenheit)  # 99.5

Given:

price_str = "49.99"
quantity_str = "3"

Task: Calculate the total cost. Expected output: 149.97

Solution:

price_str = "49.99"
quantity_str = "3"

price = float(price_str)
quantity = int(quantity_str)
total = price * quantity

print(total)  # 149.97

Given:

weight_kg = 68.5
height_m = 1.75

Task: Calculate BMI using the formula: BMI = weight / (height²)

Display result with message: "BMI: 22.37"

Solution:

weight_kg = 68.5
height_m = 1.75

bmi = weight_kg / (height_m ** 2)
print(f"BMI: {bmi:.2f}")  # BMI: 22.37

Given:

a = 10
b = 25

Task: Swap the values of a and b without using a temporary variable. After swapping: a = 25, b = 10

Solution:

a = 10
b = 25

a, b = b, a  # Python's tuple unpacking

print(a)  # 25
print(b)  # 10

Given:

amount = 1234567.891

Task: Format as currency with commas and 2 decimal places. Expected output: $1,234,567.89

Hint: Use f-string formatting with :,.2f

Solution:

amount = 1234567.891
formatted = f"${amount:,.2f}"
print(formatted)  # $1,234,567.89

Operators

Operators are symbols that perform operations on values and variables. Python has arithmetic, comparison, logical, assignment, and special operators. Let's master them all!

What is an Operator?

An operator is a symbol that tells Python to perform a specific operation. You've already used operators in math class: + for addition, - for subtraction. Programming operators work the same way, but Python has many more operators for different purposes.

Operands

The values that operators work on are called operands. In the expression 5 + 3, the operator is + and the operands are 5 and 3. Most operators work with two operands (binary operators), but some work with just one (unary operators, like -x).

Arithmetic Operators

These operators perform mathematical calculations, the same math you learned in school. Python supports all standard math operations plus some extras like floor division and exponentiation that are especially useful in data science.

Operator	Name	Example	Result	Description
`+`	Addition	`5 + 3`	8	Adds two numbers
`-`	Subtraction	`10 - 4`	6	Subtracts right from left
`*`	Multiplication	`7 * 6`	42	Multiplies two numbers
`/`	Division	`15 / 4`	3.75	Always returns float
`//`	Floor Division	`15 // 4`	3	Divides and rounds down
`%`	Modulo	`17 % 5`	2	Returns remainder
`**`	Exponentiation	`2 ** 10`	1024	Power operation

# Arithmetic operators in action
a, b = 17, 5

print(f"Addition: {a} + {b} = {a + b}")           # 22
print(f"Subtraction: {a} - {b} = {a - b}")        # 12
print(f"Multiplication: {a} * {b} = {a * b}")     # 85
print(f"Division: {a} / {b} = {a / b}")           # 3.4 (always float!)
print(f"Floor Division: {a} // {b} = {a // b}")   # 3 (rounded down)
print(f"Modulo: {a} % {b} = {a % b}")             # 2 (remainder)
print(f"Power: {a} ** 2 = {a ** 2}")              # 289

Common use cases for arithmetic operators:

# Check if number is even or odd
num = 42
is_even = num % 2 == 0        # True (no remainder when divided by 2)

# Get digits of a number
number = 12345
last_digit = number % 10      # 5
remove_last = number // 10    # 1234
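
"Rounds down" means toward negative infinity, which matters for negative numbers. A short sketch, also showing the built-in divmod(), which returns the quotient and remainder together:

```python
print(7 // 2)      # 3
print(-7 // 2)     # -4 (rounds down, not toward zero)
print(-7 % 2)      # 1  (the remainder takes the sign of the divisor)

# divmod returns (a // b, a % b) in one call
quotient, remainder = divmod(17, 5)
print(quotient, remainder)   # 3 2

# The two results always satisfy: a == (a // b) * b + (a % b)
a, b = -7, 2
identity_holds = a == (a // b) * b + (a % b)
print(identity_holds)        # True
```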

Calculating percentages and roots:

# Calculate percentage
part = 75
whole = 200
percentage = (part / whole) * 100  # 37.5%

# Square root using exponent
sqrt_16 = 16 ** 0.5           # 4.0
cube_root = 27 ** (1/3)       # 3.0

Operator Precedence: ** → * / // % → + - (same as math). The ** operator is right-associative: 2**3**2 = 2**(3**2) = 512, not 64.
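
The precedence rules are easiest to check by running a few expressions with and without parentheses. This sketch stores each result so the comparison is explicit:

```python
no_parens = 2 + 3 * 4        # * binds tighter than +
with_parens = (2 + 3) * 4
print(no_parens, with_parens)          # 14 20

right_assoc = 2 ** 3 ** 2    # evaluated as 2 ** (3 ** 2)
left_grouped = (2 ** 3) ** 2
print(right_assoc, left_grouped)       # 512 64

# ** binds tighter than a leading minus sign
negated_square = -2 ** 2     # -(2 ** 2)
square_of_negative = (-2) ** 2
print(negated_square, square_of_negative)  # -4 4
```

When in doubt, add parentheses: they cost nothing and make the intended order obvious to the reader.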

Comparison Operators

Comparison operators compare values and return a boolean (True or False). They're essential for conditions and filtering data.

Operator	Name	Example	Result
`==`	Equal to	`5 == 5`	True
`!=`	Not equal to	`5 != 3`	True
`>`	Greater than	`10 > 5`	True
`<`	Less than	`3 < 7`	True
`>=`	Greater than or equal	`5 >= 5`	True
`<=`	Less than or equal	`4 <= 3`	False

# Comparison operators
x, y = 10, 20

print(x == y)                 # False
print(x != y)                 # True
print(x > y)                  # False
print(x < y)                  # True
print(x >= 10)                # True
print(y <= 15)                # False

Chained comparisons are one of Python's superpowers:

# Chained comparisons (Python's superpower!)
age = 25
print(18 <= age <= 65)        # True (is age between 18 and 65?)
# Same as: age >= 18 and age <= 65

score = 85
print(80 <= score < 90)       # True (B grade range)

String comparisons work lexicographically: character by character, by each character's Unicode code point (alphabetical order only within the same case):

# String comparisons (lexicographic/alphabetical)
print("apple" < "banana")     # True (a comes before b)
print("Apple" < "apple")      # True (uppercase < lowercase in ASCII)
print("10" < "9")             # True (string comparison, not numeric!)

# NOTE: Be careful with types!
print(10 == "10")             # False (int vs string)
print(10 == 10.0)             # True (int and float can be equal)
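
Two common fixes for the surprises above are shown in this sketch: normalize case before comparing text, and convert to numbers before comparing numeric strings. The versions list is made-up sample data:

```python
# Case-insensitive comparison: normalize both sides first
same_word = "Apple".lower() == "apple".lower()
print(same_word)                  # True

# Numeric comparison: convert the strings to numbers first
print(int("10") < int("9"))       # False

# The same idea applies to sorting
versions = ["10", "9", "100"]
text_order = sorted(versions)             # compares as text
numeric_order = sorted(versions, key=int) # compares as numbers
print(text_order)                 # ['10', '100', '9']
print(numeric_order)              # ['9', '10', '100']
```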

Logical Operators

Logical operators combine boolean expressions. They're the building blocks of complex conditions.

Operator	Description	Example	Result
`and`	True if BOTH are True	`True and True`	True
`or`	True if AT LEAST ONE is True	`True or False`	True
`not`	Inverts the boolean value	`not True`	False

# Logical operators
age = 25
has_license = True
is_insured = False

# and - both conditions must be True
can_drive = age >= 18 and has_license
print(can_drive)              # True

# or - at least one condition must be True
needs_attention = not has_license or not is_insured
print(needs_attention)        # True (not insured)

# not - inverts the boolean
print(not True)               # False
print(not False)              # True

Combine operators for complex conditions:

# Complex conditions
income = 50000
credit_score = 720
has_job = True

# Can get a loan if: (good credit AND has job) OR (high income)
can_get_loan = (credit_score >= 700 and has_job) or income > 100000
print(can_get_loan)           # True

Short-circuit evaluation: Python stops evaluating as soon as the result is determined:

# Short-circuit evaluation
x = 5
# In "False and ...", the second part is never evaluated
result = False and (x / 0)    # No ZeroDivisionError!

# In "True or ...", the second part is never evaluated  
result = True or (x / 0)      # No ZeroDivisionError!

# Practical use: safe access
user = None
# Only access .name if user is not None
name = user and user.name     # Returns None (doesn't crash)
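
The last example works because and and or return one of their operands, not necessarily True or False. A small sketch, using a hypothetical User class defined only for this illustration:

```python
# or returns the first truthy operand (or the last operand if none is truthy)
display_name = "" or "anonymous"
print(display_name)            # anonymous
print("Priya" or "anonymous")  # Priya

# and returns the first falsy operand (or the last operand if all are truthy)
print(5 and 10)                # 10
print(0 and 10)                # 0

class User:
    """Hypothetical class for illustration."""
    def __init__(self, name):
        self.name = name

user = User("Rahul")
found_name = user and user.name        # user is truthy, so user.name is returned
missing = None
missing_name = missing and missing.name  # stops at None, .name is never accessed
print(found_name, missing_name)        # Rahul None
```

Be careful with the or-default idiom when 0 or "" are valid values, since they are falsy and will be replaced by the default.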

AND Truth Table

T and T	T
T and F	F
F and T	F
F and F	F

OR Truth Table

T or T	T
T or F	T
F or T	T
F or F	F

NOT Truth Table

not T	F
not F	T

Assignment Operators

Assignment operators assign values to variables. Python has shorthand operators that combine assignment with arithmetic.

Operator	Example	Equivalent To	Description
`=`	`x = 5`	-	Simple assignment
`+=`	`x += 3`	`x = x + 3`	Add and assign
`-=`	`x -= 2`	`x = x - 2`	Subtract and assign
`*=`	`x *= 4`	`x = x * 4`	Multiply and assign
`/=`	`x /= 2`	`x = x / 2`	Divide and assign
`//=`	`x //= 3`	`x = x // 3`	Floor divide and assign
`%=`	`x %= 3`	`x = x % 3`	Modulo and assign
`**=`	`x **= 2`	`x = x ** 2`	Power and assign

# Assignment operators
score = 100

score += 10        # score = 110
score -= 5         # score = 105
score *= 2         # score = 210
score //= 3        # score = 70

print(score)       # 70

# Multiple assignment
a = b = c = 0      # All set to 0

# Unpacking assignment
x, y, z = 1, 2, 3  # x=1, y=2, z=3

# Swap values (Python magic!)
a, b = 10, 20
a, b = b, a        # Now a=20, b=10 (no temp variable needed!)

# Extended unpacking (Python 3+)
first, *rest = [1, 2, 3, 4, 5]
print(first)       # 1
print(rest)        # [2, 3, 4, 5]

head, *middle, tail = [1, 2, 3, 4, 5]
print(head)        # 1
print(middle)      # [2, 3, 4]
print(tail)        # 5

Identity & Membership Operators

These special operators check object identity and membership in sequences.

Identity: `is` / `is not`

Checks if two variables point to the same object in memory.

# Identity operators
a = [1, 2, 3]
b = [1, 2, 3]
c = a

# == checks value equality
print(a == b)       # True (same values)

# is checks identity (same object)
print(a is b)       # False (different objects)
print(a is c)       # True (same object)

# Use 'is' when checking for None
x = None
print(x is None)    # True (correct way)
print(x == None)    # True (works but not recommended)
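
Identity has practical consequences: if two names refer to the same list, a change made through one name is visible through the other. A minimal sketch, using list.copy() to create an independent list:

```python
a = [1, 2, 3]
c = a            # c is another name for the same list
b = a.copy()     # b is a new list with the same values

a.append(4)

print(c)         # [1, 2, 3, 4] (same object as a, so it sees the change)
print(b)         # [1, 2, 3]    (independent copy)
print(a is c, a is b)   # True False
```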

Membership: `in` / `not in`

Checks if a value exists in a container (string, list, tuple, or dict; for a dict, the keys are checked).

# Membership operators
fruits = ["apple", "banana", "cherry"]

print("banana" in fruits)      # True
print("grape" in fruits)       # False
print("grape" not in fruits)   # True

# Works with strings
text = "Hello, World!"
print("World" in text)         # True
print("world" in text)         # False (case-sensitive)

# Works with dictionaries (checks keys)
person = {"name": "Priya", "age": 25}
print("name" in person)        # True (key exists)
print("Priya" in person)       # False (checks keys, not values)
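
To test dictionary values or key-value pairs instead of keys, use the dict's values() and items() views. A brief sketch reusing the same sample dictionary:

```python
person = {"name": "Priya", "age": 25}

value_found = "Priya" in person.values()      # search the values
pair_found = ("age", 25) in person.items()    # search (key, value) pairs
key_missing = "email" not in person           # keys are the default

print(value_found, pair_found, key_missing)   # True True True
```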

Common Mistake: Use == to compare values and is to check identity. Reserve is for None, or for when you specifically need to know whether two variables reference the exact same object. For booleans, test the value directly (if flag:) rather than comparing it with True or False.

Practice Questions: Operators

Test your understanding with these specific challenges.

Given:

total_minutes = 185

Task: Convert to hours and remaining minutes using floor division (//) and modulo (%).

Expected output: 3 hours and 5 minutes

Solution:

total_minutes = 185
hours = total_minutes // 60
remaining_minutes = total_minutes % 60
print(f"{hours} hours and {remaining_minutes} minutes")  # 3 hours and 5 minutes

Given:

number = 47

Task: Write an expression that evaluates to True if the number is odd, False if even.

Expected output: True

Solution:

number = 47
is_odd = number % 2 != 0
print(is_odd)  # True

Given:

score = 85
min_pass = 60
max_score = 100

Task: Write a single expression using comparison and logical operators to check if score is between min_pass and max_score (inclusive).

Expected output: True

Solution:

score = 85
min_pass = 60
max_score = 100

# Method 1: Using chained comparison (Python special)
in_range = min_pass <= score <= max_score
print(in_range)  # True

# Method 2: Using logical and
in_range_v2 = score >= min_pass and score <= max_score
print(in_range_v2)  # True

Given:

base_price = 120.00
discount_percent = 15
tax_percent = 8

Task: Calculate the final price by applying the discount first, then adding tax. Round to 2 decimal places.

Expected output: 110.16

Solution:

base_price = 120.00
discount_percent = 15
tax_percent = 8

discounted = base_price * (1 - discount_percent / 100)
final = discounted * (1 + tax_percent / 100)
print(round(final, 2))  # 110.16

Given:

a = 12  # binary: 1100
b = 10  # binary: 1010

Task: Calculate the following bitwise operations:

a & b (AND)
a | b (OR)
a ^ b (XOR)

Expected outputs: 8, 14, 6

Solution:

a = 12  # binary: 1100
b = 10  # binary: 1010

# AND: bits that are 1 in BOTH
print(a & b)  # 8 (binary: 1000)

# OR: bits that are 1 in EITHER
print(a | b)  # 14 (binary: 1110)

# XOR: bits that are different
print(a ^ b)  # 6 (binary: 0110)

Control Flow

Control flow determines the order in which code executes. With conditional statements and loops, you can make decisions and repeat actions. This is the foundation of all programming logic.

Figure: Control flow diagrams showing if-else branching (left) and for-loop iteration (right)

What is Control Flow?

By default, Python executes code from top to bottom, one line at a time. But real programs need to make decisions ("if the user is logged in, show their profile") and repeat actions ("check each item in the shopping cart"). Control flow statements let you change the order of execution, branching to different code paths or looping through code multiple times.

Two Types of Control Flow

1. Conditional Statements - Make decisions. "If this condition is true, do X; otherwise, do Y." Examples: if, elif, else.

2. Loops - Repeat actions. "Do this task 10 times" or "do this for every item in the list." Examples: for, while.

Conditional Statements: if, elif, else

Conditional statements let your program make decisions based on conditions. Think of them like a flowchart: "Is this true? If yes, go this way. If no, go that way." Python uses if, elif (short for "else if"), and else.

The condition after if is evaluated as either True or False (non-boolean values follow the truthiness rules covered earlier). If True, the indented code block runs. If False, Python skips to the next elif or else.

# Basic if statement
age = 20

if age >= 18:
    print("You are an adult")
    print("You can vote!")
# Output: You are an adult
#         You can vote!

# if-else statement
temperature = 15

if temperature >= 25:
    print("It's warm outside")
else:
    print("It's cold outside")
# Output: It's cold outside

# if-elif-else chain (multiple conditions)
score = 85

if score >= 90:
    grade = "A"
elif score >= 80:
    grade = "B"
elif score >= 70:
    grade = "C"
elif score >= 60:
    grade = "D"
else:
    grade = "F"

print(f"Your grade is: {grade}")  # Your grade is: B
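
The conditions above are all comparisons, but an if statement accepts any value: Python treats 0, empty strings, empty containers, and None as false, and most other values as true. A minimal sketch of this behavior (the describe helper is illustrative, not part of the lesson's code):

```python
def describe(value):
    """Report which branch an if statement would take for this value."""
    if value:
        return "truthy"
    else:
        return "falsy"

# Try a few representative values
for sample in [0, 1, "", "text", [], [0], None]:
    print(repr(sample), "->", describe(sample))
```

This is why writing if items: is a common way to check that a list is not empty.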

Python uses indentation! Unlike many other languages, which use braces {}, Python uses indentation (typically 4 spaces) to define code blocks. This is not optional: inconsistent indentation raises an IndentationError, and a misplaced indent can change which statements belong to a block. The colon : at the end of an if statement tells Python that an indented block follows.
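
A small sketch of how indentation decides which statements belong to an if block. The balance and messages names are made up for illustration:

```python
balance = 50
messages = []

if balance < 0:
    messages.append("Account overdrawn")     # inside the block: skipped here
    messages.append("Please deposit funds")  # inside the block: skipped here
messages.append("Balance checked")           # not indented: always runs

print(messages)  # ['Balance checked']
```

Only the two indented lines depend on the condition. The last append sits at the same level as the if, so it runs either way.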

Nested & Combined Conditions

You can place if statements inside other if statements. This is called nesting:

# Nested if statements
age = 25
has_ticket = True

if age >= 18:
    if has_ticket:
        print("Welcome to the movie!")
    else:
        print("Please buy a ticket first")
else:
    print("Sorry, this movie is for adults only")

Better approach: Use logical operators (and, or) to combine conditions instead of deep nesting:

# Better: combined with logical operators
if age >= 18 and has_ticket:
    print("Welcome to the movie!")
elif age >= 18:
    print("Please buy a ticket first")
else:
    print("Sorry, this movie is for adults only")

Complex conditions: Use parentheses for clarity when mixing and and or:

# Complex condition example
income = 60000
credit_score = 750
has_collateral = True

if (income >= 50000 and credit_score >= 700) or has_collateral:
    print("Loan approved!")
else:
    print("Loan denied")
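
The parentheses matter because and binds more tightly than or. The sketch below reuses the loan example with different, made-up values to show that moving the parentheses changes the result:

```python
income = 30000
credit_score = 650
has_collateral = True

# Without parentheses, Python groups the `and` part first,
# so these two expressions are equivalent.
without_parens = income >= 50000 and credit_score >= 700 or has_collateral
with_parens = (income >= 50000 and credit_score >= 700) or has_collateral
print(without_parens, with_parens)  # True True

# Regrouping turns this into a different rule: income is now always required.
regrouped = income >= 50000 and (credit_score >= 700 or has_collateral)
print(regrouped)  # False
```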

Ternary Operator (Conditional Expression)

Python's ternary operator lets you write simple if-else in one line:

# Ternary operator syntax: value_if_true if condition else value_if_false

age = 20
status = "adult" if age >= 18 else "minor"
print(status)  # adult

Compare this to the traditional multi-line approach:

# Traditional way (4 lines):
if age >= 18:
    status = "adult"
else:
    status = "minor"

More ternary examples - great for simple conditions:

# More examples
x = 10
result = "positive" if x > 0 else "non-positive"

# Great for default values
input_name = ""  # e.g., the user left the name field blank
user_name = input_name if input_name else "Anonymous"
print(user_name)  # Anonymous

Avoid nested ternaries! While possible, nested ternary expressions are hard to read. Use regular if-elif-else for complex conditions instead.
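
To make the readability difference concrete, here is the same three-way decision written both ways. The function names classify_nested and classify_clear are illustrative:

```python
def classify_nested(n):
    # Nested conditional expression: valid Python, but hard to scan
    return "positive" if n > 0 else "zero" if n == 0 else "negative"

def classify_clear(n):
    # Same logic as an if-elif-else chain
    if n > 0:
        return "positive"
    elif n == 0:
        return "zero"
    else:
        return "negative"

for value in (5, 0, -3):
    print(value, classify_nested(value), classify_clear(value))
```

Both functions return the same labels; the second version is simply easier to read and to extend.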

For Loops

for loops iterate over sequences (lists, strings, ranges, etc.). They're the most common loop type in Python, especially for data processing. The basic idea: "for each item in this collection, do something."

How For Loops Work: Python takes each item from the sequence one at a time, assigns it to the loop variable, and runs the indented code block. When all items are processed, the loop ends and Python continues with the code after the loop.

# Iterating over a list
fruits = ["apple", "banana", "cherry"]
for fruit in fruits:
    print(fruit)
# Output: apple
#         banana
#         cherry

You can also loop through strings character by character:

# Iterating over a string (character by character)
for char in "Python":
    print(char, end=" ")
# Output: P y t h o n

The range() function generates numeric sequences:

# Using range() for numeric sequences
# range(stop) generates numbers from 0 to stop-1
for i in range(5):           # 0, 1, 2, 3, 4
    print(i, end=" ")
# Output: 0 1 2 3 4

Use range(start, stop, step) for more control:

# range(start, stop, step) - more control
for i in range(1, 10, 2):    # 1, 3, 5, 7, 9 (start at 1, stop before 10, step by 2)
    print(i, end=" ")
# Output: 1 3 5 7 9

# Counting backwards
for i in range(10, 0, -1):   # 10, 9, 8, ... 1
    print(i, end=" ")
# Output: 10 9 8 7 6 5 4 3 2 1
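
A quick way to check what a range() call produces is to convert it to a list. The sketch below uses a few arbitrary values to show that the stop value is never included:

```python
print(list(range(5)))          # [0, 1, 2, 3, 4]
print(list(range(2, 6)))       # [2, 3, 4, 5]
print(list(range(10, 0, -3)))  # [10, 7, 4, 1]
print(list(range(5, 5)))       # [] (start equals stop, so nothing is generated)
print(len(range(1, 10, 2)))    # 5
```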

enumerate() and zip()

These built-in functions make loops more powerful and Pythonic.

enumerate() gives you both the index and value:

# enumerate() - get index AND value
fruits = ["apple", "banana", "cherry"]

# Without enumerate (not recommended)
for i in range(len(fruits)):
    print(f"{i}: {fruits[i]}")

# With enumerate (Pythonic!)
for index, fruit in enumerate(fruits):
    print(f"{index}: {fruit}")
# Output: 0: apple
#         1: banana
#         2: cherry

You can start the index from any number:

# Start index from 1
for index, fruit in enumerate(fruits, start=1):
    print(f"{index}. {fruit}")
# Output: 1. apple
#         2. banana
#         3. cherry

zip() iterates over multiple sequences together:

# zip() - iterate over multiple sequences together
names = ["Priya", "Rahul", "Ankit"]
ages = [25, 30, 35]
cities = ["Mumbai", "Delhi", "Bangalore"]

for name, age, city in zip(names, ages, cities):
    print(f"{name} is {age} years old and lives in {city}")
# Output: Priya is 25 years old and lives in Mumbai
#         Rahul is 30 years old and lives in Delhi
#         Ankit is 35 years old and lives in Bangalore

Combine enumerate() and zip() for even more control:

# Combine enumerate and zip
for i, (name, age) in enumerate(zip(names, ages)):
    print(f"{i}: {name} - {age}")
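
One detail worth knowing: zip() stops at the shortest sequence, so mismatched lengths drop data silently. The sketch below shortens the ages list on purpose and shows itertools.zip_longest from the standard library as one way to keep every item:

```python
from itertools import zip_longest

names = ["Priya", "Rahul", "Ankit"]
ages = [25, 30]  # one value missing on purpose

# zip() stops as soon as the shortest input runs out
pairs = list(zip(names, ages))
print(pairs)  # [('Priya', 25), ('Rahul', 30)]

# zip_longest() pads the shorter input with a fill value instead
padded = list(zip_longest(names, ages, fillvalue=None))
print(padded)  # [('Priya', 25), ('Rahul', 30), ('Ankit', None)]
```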

While Loops

while loops repeat as long as a condition is True. Be careful: if the condition never becomes False, you'll create an infinite loop!

# Basic while loop
count = 0
while count < 5:
    print(count)
    count += 1      # Don't forget to update the condition!
# Output (one number per line): 0 1 2 3 4

While loop with user input:

# While loop with user input
password = ""
while password != "secret123":
    password = input("Enter password: ")
print("Access granted!")

Infinite loop with break: Use while True with a break condition:

# Infinite loop with break
while True:
    user_input = input("Type 'quit' to exit: ")
    if user_input.lower() == "quit":
        break
    print(f"You typed: {user_input}")

Common pattern: Read data until a sentinel value:

# Common pattern: read until end of data
data = []
while True:
    value = input("Enter value (or 'done'): ")
    if value == "done":
        break
    data.append(value)
print(f"Collected: {data}")
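
The loops above wait for input(), so they run until the user cooperates. A common safeguard is to cap the number of attempts. The sketch below simulates user input with a list so it can run unattended; the attempts list and max_tries limit are assumptions made for illustration:

```python
attempts = iter(["letmein", "password", "secret123"])  # simulated user input
max_tries = 5
tries = 0
granted = False

while tries < max_tries:
    guess = next(attempts, None)  # None once the simulated input runs out
    tries += 1
    if guess is None:
        break                     # no more input to read
    if guess == "secret123":
        granted = True
        break                     # correct password: stop looping

print(granted, tries)  # True 3
```

Because the condition counts attempts, the loop ends even if the correct password never arrives.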

Loop Control: break, continue, pass

break

Exits the loop immediately, skipping all remaining iterations.

# Find first even number
for num in [1, 3, 5, 4, 7]:
    if num % 2 == 0:
        print(f"Found: {num}")
        break
# Output: Found: 4

continue

Skips the current iteration and moves to the next one.

# Skip negative numbers
for num in [1, -2, 3, -4, 5]:
    if num < 0:
        continue
    print(num)
# Output (one number per line): 1 3 5

pass

Does nothing. A placeholder for future code.

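# Placeholder for future logic
items = ["report.csv", "notes.txt", "data.csv"]
for item in items:
    if item.endswith(".txt"):
        pass  # TODO: handle text files later
    else:
        print(f"Processing {item}")
# Output: Processing report.csv
#         Processing data.csv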

# Practical example: processing data with break and continue
scores = [85, 92, -1, 78, 95, 88]  # -1 indicates end of valid data

valid_scores = []
for score in scores:
    # A negative score marks the end of valid data
    if score < 0:
        break  # Stop processing when we hit -1
    
    # Skip scores below threshold
    if score < 80:
        continue
    
    valid_scores.append(score)

print(f"Valid high scores: {valid_scores}")  # [85, 92]

The else Clause with Loops

Python has an unusual feature: loops can have an else clause that runs only if the loop finishes without hitting break.

# else runs if loop completes normally (no break)
for num in [1, 3, 5, 7, 9]:
    if num % 2 == 0:
        print(f"Found even: {num}")
        break
else:
    print("No even numbers found!")
# Output: No even numbers found!

# Practical use: search with else
def find_user(users, target_id):
    for user in users:
        if user["id"] == target_id:
            print(f"Found: {user['name']}")
            break
    else:
        print("User not found!")

users = [{"id": 1, "name": "Priya"}, {"id": 2, "name": "Rahul"}]
find_user(users, 3)  # User not found!
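
The else clause works the same way on while loops. A minimal sketch, using a hypothetical first_divisor helper that searches for a divisor and falls through to else when none is found:

```python
def first_divisor(n):
    """Return the smallest divisor of n greater than 1, or None if n (>= 2) is prime."""
    candidate = 2
    while candidate * candidate <= n:
        if n % candidate == 0:
            break              # divisor found: the else clause is skipped
        candidate += 1
    else:
        return None            # loop ended without break: no divisor exists
    return candidate

print(first_divisor(15))  # 3
print(first_divisor(13))  # None
```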

List Comprehensions

List comprehensions are a concise, Pythonic way to create lists. They combine a loop and optional condition into a single line.

# Basic syntax: [expression for item in iterable]

# Traditional way
squares = []
for x in range(10):
    squares.append(x ** 2)

# List comprehension way (one line!)
squares = [x ** 2 for x in range(10)]
print(squares)  # [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

# With condition: [expression for item in iterable if condition]
evens = [x for x in range(20) if x % 2 == 0]
print(evens)    # [0, 2, 4, 6, 8, 10, 12, 14, 16, 18]

# Transform and filter
names = ["Priya", "Rahul", "Ankit", "Vikram"]
long_names_upper = [name.upper() for name in names if len(name) > 5]
print(long_names_upper)  # ['VIKRAM']

# if-else in comprehension (note different position!)
numbers = [1, 2, 3, 4, 5]
labels = ["even" if n % 2 == 0 else "odd" for n in numbers]
print(labels)  # ['odd', 'even', 'odd', 'even', 'odd']

# Nested loops in comprehension
matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
flat = [num for row in matrix for num in row]
print(flat)    # [1, 2, 3, 4, 5, 6, 7, 8, 9]

# Create a matrix
grid = [[i * j for j in range(1, 4)] for i in range(1, 4)]
print(grid)    # [[1, 2, 3], [2, 4, 6], [3, 6, 9]]

Data Science Tip: List comprehensions are heavily used in data science for data transformation. However, for large datasets, vectorized NumPy and pandas operations are typically much faster!
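
When a comprehension has several clauses, they read in the same order as the equivalent nested loops, with any if filter last. A sketch for comparison (the odd-number filter is added here for illustration):

```python
matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

# Explicit loops
flat_loop = []
for row in matrix:            # first "for" in the comprehension
    for num in row:           # second "for" in the comprehension
        if num % 2 == 1:      # trailing "if"
            flat_loop.append(num)

# Same clauses, same order, in a single expression
flat_comp = [num for row in matrix for num in row if num % 2 == 1]

print(flat_loop)  # [1, 3, 5, 7, 9]
print(flat_comp)  # [1, 3, 5, 7, 9]
```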

Practice Questions: Control Flow

Test your understanding with these specific challenges.

Given:

score = 78

Task: Print the letter grade using if-elif-else:

90-100: A
80-89: B
70-79: C
60-69: D
Below 60: F

Expected output: C

Solution:

score = 78

if score >= 90:
    grade = "A"
elif score >= 80:
    grade = "B"
elif score >= 70:
    grade = "C"
elif score >= 60:
    grade = "D"
else:
    grade = "F"

print(grade)  # C

Task: Using a for loop, calculate the sum of all even numbers from 1 to 20 (inclusive).

Expected output: 110

Solution:

total = 0
for num in range(1, 21):
    if num % 2 == 0:
        total += num

print(total)  # 110

# Alternative: using range step
total_v2 = sum(range(2, 21, 2))
print(total_v2)  # 110

Task: For numbers 1 to 15, print:

"FizzBuzz" if divisible by both 3 and 5
"Fizz" if divisible by 3 only
"Buzz" if divisible by 5 only
The number itself otherwise

Expected output: 1, 2, Fizz, 4, Buzz, Fizz, 7, 8, Fizz, Buzz, 11, Fizz, 13, 14, FizzBuzz

Solution:

result = []
for num in range(1, 16):
    if num % 3 == 0 and num % 5 == 0:
        result.append("FizzBuzz")
    elif num % 3 == 0:
        result.append("Fizz")
    elif num % 5 == 0:
        result.append("Buzz")
    else:
        result.append(str(num))

print(", ".join(result))
# 1, 2, Fizz, 4, Buzz, Fizz, 7, 8, Fizz, Buzz, 11, Fizz, 13, 14, FizzBuzz

Given:

text = "Hello World"

Task: Count the number of vowels (a, e, i, o, u) in the string. Use a for loop.

Expected output: 3

Solution:

text = "Hello World"
vowels = "aeiouAEIOU"
count = 0

for char in text:
    if char in vowels:
        count += 1

print(count)  # 3

Task: Find all prime numbers between 1 and 30 using nested loops.

Expected output: [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]

Hint: A prime number is only divisible by 1 and itself. Use break to exit early if you find a divisor.

Solution:

primes = []

for num in range(2, 31):
    is_prime = True
    for i in range(2, num):
        if num % i == 0:
            is_prime = False
            break
    if is_prime:
        primes.append(num)

print(primes)
# [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]

Functions

Functions are reusable blocks of code that perform specific tasks. They're essential for writing clean, organized, and maintainable code. Let's master Python functions!

What is a Function?

A function is like a recipe: you give it a name, define the steps, and whenever you need to perform that task, you just call the function by name. Instead of writing the same code over and over, you write it once in a function and reuse it as many times as you need.

Why Use Functions?

1. Reusability - Write once, use many times.
2. Organization - Break complex problems into smaller pieces.
3. Readability - calculate_tax(price) is clearer than 10 lines of tax calculation code.
4. Testing - Easy to test small, focused functions.

Defining Functions

Use the def keyword (short for "define") to create a function. The function name should describe what the function does, following Python's naming conventions (lowercase letters with underscores separating words, like calculate_average).

The structure is: def function_name(parameters): followed by an indented code block. Parameters are optional. Some functions don't need any input.

# Basic function definition
def greet():
    """A simple greeting function."""
    print("Hello, World!")

# Call the function (run it) by using its name with parentheses
greet()  # Output: Hello, World!

Function with parameters: Pass data into your function:

# Function with parameters (input values)
def greet_person(name):
    """Greet a specific person."""
    print(f"Hello, {name}!")

greet_person("Priya")  # Output: Hello, Priya!

Return values: Use return to send data back from a function:

# Function with return value (output)
def add(a, b):
    """Add two numbers and return the result."""
    return a + b

result = add(5, 3)
print(result)  # 8

Multiple return values: Python can return multiple values as a tuple:

# Multiple return values (returns a tuple)
def get_stats(numbers):
    """Calculate min, max, and average."""
    return min(numbers), max(numbers), sum(numbers) / len(numbers)

minimum, maximum, average = get_stats([1, 2, 3, 4, 5])
print(f"Min: {minimum}, Max: {maximum}, Avg: {average}")
# Output: Min: 1, Max: 5, Avg: 3.0

Docstrings: The triple-quoted string right after def is called a docstring (documentation string). It explains what the function does. This is a Python best practice. You can access it with help(function) or function.__doc__.
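
A short sketch of reading a docstring at runtime, reusing the add function from the example above:

```python
def add(a, b):
    """Add two numbers and return the result."""
    return a + b

# The docstring is stored on the function object itself
print(add.__doc__)   # Add two numbers and return the result.
print(add.__name__)  # add

# help(add) would print the signature and docstring in an interactive session
```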

Parameters and Arguments

Parameters are the variable names in the function definition. Arguments are the actual values you pass when calling the function. Python offers flexible ways to pass data to functions: positional, keyword, default values, and variable-length arguments.
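
A brief sketch of the difference between positional and keyword arguments. The describe_order function and its parameters are made up for illustration:

```python
def describe_order(item, quantity):
    """Return a short description of an order line."""
    return f"{quantity} x {item}"

# Positional arguments: matched to parameters by position
print(describe_order("notebook", 3))                # 3 x notebook

# Keyword arguments: matched by name, so the order does not matter
print(describe_order(quantity=3, item="notebook"))  # 3 x notebook

# Mixed: positional arguments must come before keyword arguments
print(describe_order("notebook", quantity=3))       # 3 x notebook
```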

Default Parameter Values

Give parameters default values to make them optional:

# Default parameter values
def greet(name, greeting="Hello"):
    """Greet with customizable greeting."""
    print(f"{greeting}, {name}!")

greet("Priya")              # Hello, Priya! (uses default)
greet("Rahul", "Hi")          # Hi, Rahul! (overrides default)
greet("Ankit", greeting="Hey")  # Hey, Ankit! (keyword argument)

Functions can have multiple parameters with default values:

# Multiple defaults
def create_user(name, age=0, city="Unknown", active=True):
    return {
        "name": name,
        "age": age,
        "city": city,
        "active": active
    }

user1 = create_user("Priya")
user2 = create_user("Rahul", 25)
user3 = create_user("Ankit", city="Mumbai", age=30)

Warning: Never use mutable objects (lists, dicts) as default values! Python creates the default object once, when the function is defined, so the same object is shared across all calls.

# BAD - mutable default
def add_item(item, items=[]):
    items.append(item)
    return items

print(add_item("a"))  # ['a']
print(add_item("b"))  # ['a', 'b'] - Unexpected!

# GOOD - use None and create inside
def add_item(item, items=None):
    if items is None:
        items = []
    items.append(item)
    return items
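
One way to see why the first version misbehaves is to inspect the function's __defaults__ attribute, where Python stores default values. In this sketch, add_item_shared and add_item_safe are renamed copies of the two versions above so that both can exist side by side:

```python
def add_item_shared(item, items=[]):
    items.append(item)
    return items

# The default list lives on the function object and is reused on every call
print(add_item_shared.__defaults__)  # ([],)
add_item_shared("a")
print(add_item_shared.__defaults__)  # (['a'],)

def add_item_safe(item, items=None):
    if items is None:
        items = []  # a fresh list for each call
    items.append(item)
    return items

first = add_item_safe("a")
second = add_item_safe("b")
print(first, second)  # ['a'] ['b']
```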

*args and **kwargs

These special syntaxes allow functions to accept any number of arguments:

*args (Positional)

Collects extra positional arguments into a tuple.

# *args - variable positional arguments
def sum_all(*numbers):
    """Sum any number of values."""
    total = 0
    for num in numbers:
        total += num
    return total

print(sum_all(1, 2, 3))        # 6
print(sum_all(1, 2, 3, 4, 5))  # 15
print(sum_all())               # 0

# Mixed with regular parameters
def greet_all(greeting, *names):
    for name in names:
        print(f"{greeting}, {name}!")

greet_all("Hello", "Priya", "Rahul", "Ankit")
# Hello, Priya!
# Hello, Rahul!
# Hello, Ankit!

**kwargs (Keyword)

Collects extra keyword arguments into a dictionary.

# **kwargs - variable keyword arguments
def print_info(**info):
    """Print all key-value pairs."""
    for key, value in info.items():
        print(f"{key}: {value}")

print_info(name="Priya", age=25, city="Mumbai")
# name: Priya
# age: 25
# city: Mumbai

# Mixed with regular and *args
def create_record(id, *tags, **attributes):
    return {
        "id": id,
        "tags": tags,
        "attributes": attributes
    }

record = create_record(1, "python", "data",
                       author="Priya", status="active")
# {'id': 1, 'tags': ('python', 'data'), 
#  'attributes': {'author': 'Priya', 'status': 'active'}}

Unpacking Arguments

# Unpack a list/tuple into positional arguments
def add(a, b, c):
    return a + b + c

numbers = [1, 2, 3]
print(add(*numbers))          # 6 (unpacks list into a=1, b=2, c=3)

# Unpack a dictionary into keyword arguments
def greet(name, greeting, punctuation):
    return f"{greeting}, {name}{punctuation}"

params = {"name": "Priya", "greeting": "Hello", "punctuation": "!"}
print(greet(**params))        # Hello, Priya!

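# Combining unpacking (draw_point is an illustrative helper, defined so the example runs)
def draw_point(x, y, color="black", size=1):
    return f"Point at ({x}, {y}), color={color}, size={size}"

coords = (10, 20)
options = {"color": "red", "size": 5}
print(draw_point(*coords, **options))  # same as draw_point(10, 20, color="red", size=5)
# Output: Point at (10, 20), color=red, size=5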

Lambda Functions

Lambda functions are small, anonymous functions defined with the lambda keyword. They're useful for short operations, especially with map(), filter(), and sorted().

# Lambda syntax: lambda arguments: expression

# Regular function
def square(x):
    return x ** 2

# Equivalent lambda
square = lambda x: x ** 2

print(square(5))  # 25

# Multiple arguments
add = lambda a, b: a + b
print(add(3, 4))  # 7

Common use case: Sorting with custom keys:

# sorted() with custom key
students = [
    {"name": "Priya", "grade": 85},
    {"name": "Rahul", "grade": 92},
    {"name": "Ankit", "grade": 78}
]

# Sort by grade
sorted_by_grade = sorted(students, key=lambda s: s["grade"])
print([s["name"] for s in sorted_by_grade])  # ['Ankit', 'Priya', 'Rahul']

# Sort by grade descending
sorted_desc = sorted(students, key=lambda s: s["grade"], reverse=True)
print([s["name"] for s in sorted_desc])  # ['Rahul', 'Priya', 'Ankit']

map() applies a function to each element:

# map() - apply function to each element
numbers = [1, 2, 3, 4, 5]
squared = list(map(lambda x: x ** 2, numbers))
print(squared)  # [1, 4, 9, 16, 25]

filter() keeps elements that satisfy a condition:

# filter() - keep elements that satisfy condition
evens = list(filter(lambda x: x % 2 == 0, numbers))
print(evens)  # [2, 4]

List comprehensions are often cleaner than combining map and filter:

# Combine map and filter (complex)
result = list(map(lambda x: x ** 2, filter(lambda x: x % 2 == 0, range(10))))
print(result)  # [0, 4, 16, 36, 64]

# List comprehension is often cleaner
result = [x ** 2 for x in range(10) if x % 2 == 0]
print(result)  # [0, 4, 16, 36, 64]
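
In Python 3, map() and filter() return lazy iterators rather than lists, which is why the examples above wrap them in list(). A brief sketch of what that means in practice:

```python
numbers = [1, 2, 3, 4, 5]
squares = map(lambda x: x ** 2, numbers)

first = next(squares)   # values are computed one at a time, on demand
rest = list(squares)    # consumes whatever is left
again = list(squares)   # the iterator is now exhausted

print(first)  # 1
print(rest)   # [4, 9, 16, 25]
print(again)  # []
```

If the results are needed more than once, convert them to a list first or use a list comprehension.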

Variable Scope

Scope determines where a variable can be accessed. Python has local, enclosing, global, and built-in scopes (LEGB rule).

# Global scope - accessible everywhere
global_var = "I'm global"

def my_function():
    # Local scope - only accessible inside this function
    local_var = "I'm local"
    print(global_var)   # Can read global
    print(local_var)    # Can access local

my_function()
print(global_var)       # Works
# print(local_var)      # NameError: name 'local_var' is not defined

# Modifying global variables
counter = 0

def increment():
    global counter      # Declare we're using the global
    counter += 1

increment()
increment()
print(counter)          # 2

# Enclosing scope (nested functions)
def outer():
    outer_var = "outer"
    
    def inner():
        nonlocal outer_var  # Rebind outer_var in the enclosing scope
        outer_var = "modified by inner"
        print(outer_var)
    
    inner()
    print(outer_var)    # "modified by inner"

outer()

LEGB Rule (Scope Resolution Order)

Python searches for variables in this order:

L - Local: Inside the current function

E - Enclosing: Inside any enclosing functions

G - Global: Module-level variables

B - Built-in: Python built-ins (print, len, etc.)
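
The lookup order can be traced with one name defined at several levels. This is a minimal sketch; the label variable and the function names are made up for illustration:

```python
label = "global"

def outer():
    label = "enclosing"

    def inner_with_local():
        label = "local"   # L: found in the local scope first
        return label

    def inner_without_local():
        return label      # E: not local, so the enclosing scope is used

    return inner_with_local(), inner_without_local()

def reads_global():
    return label          # G: no local or enclosing binding exists

print(outer())         # ('local', 'enclosing')
print(reads_global())  # global
print(len(label))      # 6 (len is found in the built-in scope)

# Assigning to a name anywhere in a function makes it local to that function,
# so reading it before the assignment fails.
def broken():
    label = label + "!"
    return label

try:
    broken()
except UnboundLocalError as exc:
    print(type(exc).__name__)  # UnboundLocalError
```

The last case is the reason the global and nonlocal keywords exist: without them, an assignment always creates a local variable.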

# Best practice: avoid global, return values instead

# BAD - Using global (not recommended)
result = 0
def calculate_bad():
    global result
    result = 42

# GOOD - Return values (recommended)
def calculate_good():
    return 42

result = calculate_good()

# Example: Pure function (no side effects)
def calculate_tax(income, rate=0.25):
    """Calculate tax. Pure function - same input always gives same output."""
    return income * rate

tax = calculate_tax(50000)  # 12500.0

Practical Example: Data Processing Functions

Let's combine everything into a real data science example:

# Data processing functions for data science

def clean_data(data, remove_nulls=True, remove_duplicates=True):
    """
    Clean a list of dictionaries (records).
    
    Args:
        data: List of dictionaries
        remove_nulls: Remove records with None values
        remove_duplicates: Remove duplicate records
    
    Returns:
        Cleaned list of dictionaries
    """
    result = data.copy()
    
    if remove_nulls:
        result = [
            record for record in result 
            if all(v is not None for v in record.values())
        ]
    
    if remove_duplicates:
        seen = set()
        unique = []
        for record in result:
            # Convert dict to tuple for hashing
            key = tuple(sorted(record.items()))
            if key not in seen:
                seen.add(key)
                unique.append(record)
        result = unique
    
    return result


def calculate_statistics(values):
    """Calculate basic statistics for a list of numbers."""
    if not values:
        return None
    
    n = len(values)
    mean = sum(values) / n
    
    # Population variance (divides by n) and standard deviation
    variance = sum((x - mean) ** 2 for x in values) / n
    std_dev = variance ** 0.5
    
    # Sort for median
    sorted_values = sorted(values)
    if n % 2 == 0:
        median = (sorted_values[n//2 - 1] + sorted_values[n//2]) / 2
    else:
        median = sorted_values[n//2]
    
    return {
        "count": n,
        "mean": mean,
        "median": median,
        "std_dev": std_dev,
        "min": min(values),
        "max": max(values)
    }


def filter_records(data, **conditions):
    """
    Filter records based on conditions.
    
    Example: filter_records(data, age=25, city="Mumbai")
    """
    return [
        record for record in data
        if all(record.get(key) == value for key, value in conditions.items())
    ]


# Using the functions
raw_data = [
    {"name": "Priya", "age": 25, "salary": 50000},
    {"name": "Rahul", "age": 30, "salary": 60000},
    {"name": "Ankit", "age": None, "salary": 55000},  # Will be removed
    {"name": "Priya", "age": 25, "salary": 50000},      # Duplicate
    {"name": "Vikram", "age": 35, "salary": 70000},
]

# Clean the data
cleaned = clean_data(raw_data)
print(f"Cleaned: {len(cleaned)} records")  # 3 records

# Calculate salary statistics
salaries = [r["salary"] for r in cleaned]
stats = calculate_statistics(salaries)
print(f"Average salary: ${stats['mean']:,.2f}")

# Filter records by age
age_30 = filter_records(cleaned, age=30)
print(f"Age 30: {[r['name'] for r in age_30]}")  # Age 30: ['Rahul']

Next Steps: In upcoming modules, you'll learn how pandas and NumPy provide optimized versions of these operations that can handle millions of records efficiently!
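
Before reaching for pandas, hand-written statistics can be cross-checked against Python's standard statistics module. The sketch below repeats the formulas from calculate_statistics on the three cleaned salaries. Note that dividing by n gives the population standard deviation (statistics.pstdev), while statistics.stdev divides by n - 1:

```python
import math
import statistics

salaries = [50000, 60000, 70000]
n = len(salaries)
mean = sum(salaries) / n

# Same formulas as calculate_statistics: divide by n (population variance)
population_var = sum((x - mean) ** 2 for x in salaries) / n
population_std = population_var ** 0.5

print(math.isclose(mean, statistics.mean(salaries)))              # True
print(math.isclose(population_std, statistics.pstdev(salaries)))  # True
print(statistics.median(salaries))                                # 60000

# The sample standard deviation divides by n - 1, so it is larger
sample_std = statistics.stdev(salaries)
print(sample_std > population_std)                                # True
```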

Practice Questions: Functions

Test your understanding with these specific challenges.

Task: Write a function rectangle_area(length, width) that returns the area.

# Test:
print(rectangle_area(5, 3))  # Expected: 15
print(rectangle_area(10, 7)) # Expected: 70

Solution:

def rectangle_area(length, width):
    return length * width

print(rectangle_area(5, 3))   # 15
print(rectangle_area(10, 7))  # 70

Task: Write a function is_palindrome(text) that returns True if the text reads the same forwards and backwards (ignore case).

# Test:
print(is_palindrome("Radar"))  # Expected: True
print(is_palindrome("Hello"))  # Expected: False
print(is_palindrome("Level"))  # Expected: True

Solution:

def is_palindrome(text):
    text = text.lower()
    return text == text[::-1]

print(is_palindrome("Radar"))  # True
print(is_palindrome("Hello"))  # False
print(is_palindrome("Level"))  # True

Task: Write a function power(base, exponent=2) that returns base raised to the exponent. If no exponent is given, it defaults to 2 (square).

# Test:
print(power(5))      # Expected: 25 (5^2)
print(power(2, 10))  # Expected: 1024 (2^10)
print(power(3, 3))   # Expected: 27 (3^3)

Solution:

def power(base, exponent=2):
    return base ** exponent

print(power(5))      # 25
print(power(2, 10))  # 1024
print(power(3, 3))   # 27

Task: Write a function min_max(numbers) that takes a list and returns both the minimum and maximum values as a tuple.

# Test:
data = [4, 2, 9, 1, 7]
minimum, maximum = min_max(data)
print(minimum)  # Expected: 1
print(maximum)  # Expected: 9

Solution:

def min_max(numbers):
    return min(numbers), max(numbers)

data = [4, 2, 9, 1, 7]
minimum, maximum = min_max(data)
print(minimum)  # 1
print(maximum)  # 9

Task: Write a function sum_all(*args) that accepts any number of arguments and returns their sum.

# Test:
print(sum_all(1, 2, 3))        # Expected: 6
print(sum_all(10, 20))         # Expected: 30
print(sum_all(5, 5, 5, 5, 5))  # Expected: 25

Solution:

def sum_all(*args):
    return sum(args)

print(sum_all(1, 2, 3))        # 6
print(sum_all(10, 20))         # 30
print(sum_all(5, 5, 5, 5, 5))  # 25

Task: Write a recursive function factorial(n) that calculates n! (n factorial = n × (n-1) × ... × 1).

# Test:
print(factorial(5))  # Expected: 120 (5 × 4 × 3 × 2 × 1)
print(factorial(0))  # Expected: 1 (by definition)
print(factorial(7))  # Expected: 5040

Hint: Base case is when n ≤ 1, return 1. Otherwise return n × factorial(n-1)

Solution:

def factorial(n):
    if n <= 1:
        return 1
    return n * factorial(n - 1)

print(factorial(5))  # 120
print(factorial(0))  # 1
print(factorial(7))  # 5040

Given:

temperatures_celsius = [0, 10, 20, 30, 40]

Task: Use lambda and map() to convert all temperatures to Fahrenheit. Formula: F = C × 9/5 + 32

Expected output: [32.0, 50.0, 68.0, 86.0, 104.0]

Solution:

temperatures_celsius = [0, 10, 20, 30, 40]

temperatures_fahrenheit = list(map(
    lambda c: c * 9/5 + 32,
    temperatures_celsius
))

print(temperatures_fahrenheit)
# [32.0, 50.0, 68.0, 86.0, 104.0]

Key Takeaways

Python is King

Python dominates data science due to its simplicity, rich ecosystem (NumPy, pandas, scikit-learn), and huge community support

Dynamic Typing

Python is dynamically typed. Variables don't need type declarations. Use type() and isinstance() to check types

Operators Matter

Master arithmetic (+, **, //), comparison (==, !=), and logical (and, or) operators for data manipulation

Control Your Flow

Use if/elif/else for decisions, for loops for iteration, and while loops for conditional repetition

Functions are Reusable

Define functions with def, use parameters with defaults, and understand *args/**kwargs for flexibility

Scope Awareness

Variables have local or global scope. Use the global keyword carefully, and prefer returning values over modifying globals
