Capstone Project 5

Image Processor

Build a professional image processing library using modern C++17/20 features. Implement filters, geometric transformations, edge detection algorithms, color space conversions, and multiple file format support. Create a complete image manipulation toolkit demonstrating advanced C++ techniques.

20-30 hours
Advanced
700 Points
What You Will Build
  • Image Loading & Saving (PNG, JPG, BMP)
  • Filter Pipeline (Blur, Sharpen, Emboss)
  • Edge Detection (Sobel, Canny)
  • Color Space Conversions
  • Geometric Transformations
  • Histogram Analysis
01

Project Overview

This advanced capstone project challenges you to build a professional image processing library from scratch. To test your algorithms, you will use the Intel Image Classification Dataset from Kaggle, which contains 25,000 images at 150x150 resolution across 6 categories (buildings, forest, glacier, mountain, sea, street). The library must demonstrate proficiency in pixel manipulation, convolution filters, color theory, and geometric transformations using modern C++17/20 features.

Skills Applied: This project tests your proficiency in C++ templates, smart pointers, operator overloading, multithreading for parallel processing, memory management, and mathematical algorithms for image manipulation.
Image I/O: Load and save PNG, JPG, BMP with proper error handling
Filters: Convolution-based blur, sharpen, emboss, edge detection
Color Spaces: Accurate RGB, HSV, HSL, and grayscale conversions
Transforms: Scale, rotate, flip, crop with interpolation

Learning Objectives

Technical Skills
  • Master 2D array manipulation and pixel-level operations
  • Implement convolution kernels for various filter effects
  • Build thread-safe parallel image processing pipelines
  • Design efficient memory management for large images
  • Create flexible template-based image container classes
Image Processing Skills
  • Understand color models and their mathematical relationships
  • Implement edge detection using Sobel and Canny algorithms
  • Apply geometric transformations with interpolation
  • Create histogram equalization for image enhancement
  • Build a composable filter pipeline architecture
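
Histogram equalization is listed as an objective but not specified further in this brief. As a rough sketch only, the classic CDF-based remapping for an 8-bit grayscale buffer could look like the following. The function name `equalizeHistogram` and the flat `std::vector` layout are assumptions for illustration, not part of the required API.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: equalizes an 8-bit grayscale buffer by mapping each
// level through the normalized cumulative distribution function (CDF).
std::vector<std::uint8_t> equalizeHistogram(const std::vector<std::uint8_t>& gray) {
    assert(!gray.empty());

    // Histogram, then running sum to turn it into a CDF.
    std::array<std::size_t, 256> cdf{};
    for (std::uint8_t p : gray) ++cdf[p];
    for (std::size_t v = 1; v < cdf.size(); ++v) cdf[v] += cdf[v - 1];

    // Smallest non-zero CDF value (the darkest level that actually occurs).
    std::size_t cdf_min = 0;
    for (std::size_t c : cdf) {
        if (c != 0) { cdf_min = c; break; }
    }

    const std::size_t total = gray.size();
    if (total == cdf_min) return gray;  // single-level image: nothing to stretch

    // Lookup table: (cdf - cdf_min) / (total - cdf_min) scaled to [0, 255].
    std::array<std::uint8_t, 256> lut{};
    for (std::size_t v = 0; v < lut.size(); ++v) {
        if (cdf[v] < cdf_min) continue;  // levels darker than any pixel
        const double scaled = (cdf[v] - cdf_min) * 255.0 / (total - cdf_min);
        lut[v] = static_cast<std::uint8_t>(std::lround(scaled));
    }

    std::vector<std::uint8_t> out;
    out.reserve(gray.size());
    for (std::uint8_t p : gray) out.push_back(lut[p]);
    return out;
}
```

For color images, one common choice is to equalize only a luminance or value channel and leave hue untouched.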
02

Company Scenario

PixelPerfect Labs

You have been hired as a Software Engineer at PixelPerfect Labs, a computer vision startup developing image analysis tools for photographers and content creators. The company needs a high-performance, cross-platform image processing library that can be integrated into their desktop application suite. They require a library that's lightweight, fast, and doesn't depend on heavy frameworks like OpenCV.

"We need a lean, mean image processing library written in pure C++. It should handle common operations like filters, color adjustments, and transformations efficiently. Our users process thousands of images daily, so performance is critical. Can you build something that processes a 4K image in under 100ms?"

Sarah Chen, CTO

Technical Requirements

Performance Targets
  • Process 1920x1080 image in < 50ms for basic filters
  • Edge detection on 4K image in < 200ms
  • Memory usage < 4x image size during processing
  • Support for multi-threaded batch processing
Core Features
  • PNG, JPG, BMP file format support
  • At least 10 different filter effects
  • Geometric transformations with interpolation
  • Color space conversions (RGB, HSV, HSL)
Architecture
  • Header-only or static library option
  • Template-based for different pixel formats
  • RAII for all resource management
  • Exception-safe operations
Quality
  • Unit tests with > 80% code coverage
  • Comprehensive documentation
  • Example programs demonstrating features
  • Cross-platform (Windows, Linux, macOS)
Pro Tip: Design your library with a clean API! Users should be able to chain operations fluently like: image.blur(5).grayscale().rotate(45).save("output.png")
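
One way to get this kind of chaining is to have every operation return a new image by value. The sketch below is illustrative only: `GrayImage`, `inverted()`, and `flippedHorizontally()` are hypothetical names on a toy 8-bit type, not the library's required interface.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical minimal grayscale image, used only to illustrate chaining.
class GrayImage {
public:
    GrayImage(std::size_t width, std::size_t height, std::uint8_t fill = 0)
        : width_(width), height_(height), pixels_(width * height, fill) {}

    std::size_t width() const { return width_; }
    std::size_t height() const { return height_; }

    std::uint8_t& at(std::size_t x, std::size_t y) {
        assert(x < width_ && y < height_);
        return pixels_[y * width_ + x];
    }
    std::uint8_t at(std::size_t x, std::size_t y) const {
        assert(x < width_ && y < height_);
        return pixels_[y * width_ + x];
    }

    // Each operation returns a new image, so calls can be chained and the
    // source image is left unmodified.
    GrayImage inverted() const {
        GrayImage out(*this);
        for (auto& p : out.pixels_) p = static_cast<std::uint8_t>(255 - p);
        return out;
    }

    GrayImage flippedHorizontally() const {
        GrayImage out(*this);
        for (std::size_t y = 0; y < height_; ++y) {
            const auto row_begin =
                out.pixels_.begin() + static_cast<std::ptrdiff_t>(y * width_);
            std::reverse(row_begin, row_begin + static_cast<std::ptrdiff_t>(width_));
        }
        return out;
    }

private:
    std::size_t width_;
    std::size_t height_;
    std::vector<std::uint8_t> pixels_;
};
```

Returning by value keeps each step free of side effects and relies on move semantics to avoid extra copies of the result. Returning `*this` by reference from mutating methods is the other common design and avoids the per-step allocation at the cost of modifying the source.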
03

Test Data & Benchmarks

Download the test suite and benchmark data to validate your image processing library. The files contain filter benchmarks, transformation test cases, and color accuracy targets, along with performance targets, test scenarios, and expected outputs.

Original Data Source

This project uses test images from the Intel Image Classification Dataset from Kaggle - containing 25,000 images of natural scenes (150x150 pixels) across 6 categories (buildings, forest, glacier, mountain, sea, street). These images are perfect for testing your image processing algorithms on real-world photography.

Dataset Info: 25,000 images × 150x150 pixels | 6 Categories: Buildings, Forest, Glacier, Mountain, Sea, Street | Train: 14k | Test: 3k | Prediction: 7k | Format: JPEG | Total Size: ~370MB
Test Data Schema

Column | Type | Description
test_id | String | Unique test identifier (e.g., FLT_001, TRF_015)
category | String | filter, transform, color, edge, histogram
test_name | String | Descriptive test name
input_width | Integer | Input image width (64-4096)
input_height | Integer | Input image height (64-4096)
operation | String | blur, sharpen, grayscale, rotate, etc.
parameters | String | JSON parameters for the operation
expected_time_ms | Float | Maximum allowed processing time
accuracy_threshold | Float | Minimum accuracy percentage (PSNR/SSIM)
priority | String | critical, high, medium, low

Column | Type | Description
filter_id | String | Unique filter test ID
filter_type | String | blur, sharpen, emboss, edge_detect, etc.
kernel_size | Integer | Kernel size (3, 5, 7, 9)
image_size | String | Image dimensions (e.g., 1920x1080)
target_fps | Float | Minimum frames per second
expected_output_hash | String | MD5 hash of expected output
tolerance | Float | Pixel value tolerance (0.0-1.0)

Column | Type | Description
color_test_id | String | Unique color test ID
source_space | String | RGB, HSV, HSL, CMYK, Grayscale
target_space | String | Target color space
input_values | String | JSON array of input color values
expected_values | String | JSON array of expected output values
precision | Integer | Decimal precision required (2-6)
round_trip | Boolean | Whether round-trip accuracy is tested

Column | Type | Description
transform_test_id | String | Unique transform test ID
operation | String | scale, rotate, flip, crop, skew
input_dimensions | String | Input image size
parameters | String | JSON parameters (angle, scale factor, etc.)
interpolation | String | nearest, bilinear, bicubic
expected_dimensions | String | Expected output size
psnr_threshold | Float | Minimum PSNR value (dB)
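
The transform tests are scored against a PSNR threshold in dB. As a hedged sketch of how such a score might be computed for 8-bit data, the helper below uses the standard definition PSNR = 10 * log10(255^2 / MSE). The function name `psnr` and the flat buffer layout are assumptions; the test suite's exact comparison procedure is not specified here.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical helper: PSNR in dB between two 8-bit buffers of equal size.
// Returns +infinity when the buffers are identical (MSE == 0).
double psnr(const std::vector<std::uint8_t>& a, const std::vector<std::uint8_t>& b) {
    assert(a.size() == b.size() && !a.empty());

    double sum_sq = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const double d = static_cast<double>(a[i]) - static_cast<double>(b[i]);
        sum_sq += d * d;
    }

    const double mse = sum_sq / static_cast<double>(a.size());
    if (mse == 0.0) return std::numeric_limits<double>::infinity();

    // Peak value is 255 for 8-bit channels.
    return 10.0 * std::log10(255.0 * 255.0 / mse);
}
```

For RGBA images the same function can be applied to the interleaved channel bytes, which averages the error over all channels.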
Test Suite Stats: 300 test scenarios, 150 filter benchmarks, 100 color conversion tests, 100 transform accuracy tests
Performance Targets: 1080p filter <50ms, 4K edge detection <200ms, Color conversion <5ms per pixel
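
To check your own code against timing targets like these, a small harness around `std::chrono::steady_clock` is usually enough. The sketch below is one possible approach: `bestTimeMs` and `invertInPlace` are hypothetical names, and taking the best of several runs is a choice made here to reduce noise, not a rule from the test suite.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <vector>

// Hypothetical helper: runs `op` `runs` times and returns the fastest
// wall-clock time in milliseconds.
template <typename Op>
double bestTimeMs(Op&& op, int runs = 5) {
    assert(runs > 0);
    using clock = std::chrono::steady_clock;

    double best = -1.0;
    for (int i = 0; i < runs; ++i) {
        const auto start = clock::now();
        op();
        const auto stop = clock::now();
        const double ms =
            std::chrono::duration<double, std::milli>(stop - start).count();
        if (best < 0.0 || ms < best) best = ms;
    }
    return best;
}

// Example workload: invert every byte of a buffer in place.
void invertInPlace(std::vector<std::uint8_t>& pixels) {
    for (auto& p : pixels) p = static_cast<std::uint8_t>(255 - p);
}
```

Measure optimized builds only (for example `-O2`), since debug builds can be many times slower and say little about whether a target is met.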
04

Project Requirements

Your image processing library must include all of the following systems. Structure your code with clean separation between core components and user-facing API.

1
Image I/O System

File Format Support:

  • PNG loading and saving with alpha channel support
  • JPEG loading and saving with quality settings
  • BMP loading and saving (24-bit and 32-bit)
  • Graceful error handling for corrupted files

Image Container:

  • Image<T> template class supporting various pixel types
  • Support for 8-bit, 16-bit, and floating-point channels
  • Efficient copy and move semantics
  • Iterator support for pixel-level operations
Deliverable: Image I/O module supporting PNG, JPG, BMP with proper error handling and RAII resource management.
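
BMP is the simplest of the three formats to write by hand. As a minimal sketch, assuming tightly packed top-down RGB input, an uncompressed 24-bit BMP can be assembled in memory as shown below. `encodeBmp24`, `putU16`, and `putU32` are hypothetical helper names; writing the bytes to disk and handling 32-bit files are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Append little-endian integers to a byte buffer.
inline void putU16(std::vector<std::uint8_t>& out, std::uint16_t v) {
    out.push_back(static_cast<std::uint8_t>(v & 0xFF));
    out.push_back(static_cast<std::uint8_t>((v >> 8) & 0xFF));
}
inline void putU32(std::vector<std::uint8_t>& out, std::uint32_t v) {
    for (int shift = 0; shift < 32; shift += 8)
        out.push_back(static_cast<std::uint8_t>((v >> shift) & 0xFF));
}

// Hypothetical encoder: top-down RGB (3 bytes per pixel) to a 24-bit BMP.
std::vector<std::uint8_t> encodeBmp24(std::uint32_t width, std::uint32_t height,
                                      const std::vector<std::uint8_t>& rgb) {
    assert(width > 0 && height > 0);
    assert(rgb.size() == static_cast<std::size_t>(width) * height * 3);

    const std::uint32_t row_bytes = (width * 3 + 3) & ~3u;  // rows pad to 4 bytes
    const std::uint32_t data_size = row_bytes * height;
    const std::uint32_t header_size = 14 + 40;

    std::vector<std::uint8_t> out;
    out.reserve(header_size + data_size);

    // BITMAPFILEHEADER (14 bytes)
    out.push_back('B');
    out.push_back('M');
    putU32(out, header_size + data_size);  // total file size
    putU16(out, 0);                        // reserved
    putU16(out, 0);                        // reserved
    putU32(out, header_size);              // offset to pixel data

    // BITMAPINFOHEADER (40 bytes)
    putU32(out, 40);         // header size
    putU32(out, width);
    putU32(out, height);     // positive height means bottom-up rows
    putU16(out, 1);          // color planes
    putU16(out, 24);         // bits per pixel
    putU32(out, 0);          // BI_RGB (no compression)
    putU32(out, data_size);
    putU32(out, 2835);       // horizontal resolution, about 72 DPI
    putU32(out, 2835);       // vertical resolution
    putU32(out, 0);          // palette colors
    putU32(out, 0);          // important colors

    // Pixel data: bottom row first, each pixel stored as B, G, R.
    for (std::uint32_t row = height; row-- > 0;) {
        for (std::uint32_t x = 0; x < width; ++x) {
            const std::size_t i = (static_cast<std::size_t>(row) * width + x) * 3;
            out.push_back(rgb[i + 2]);
            out.push_back(rgb[i + 1]);
            out.push_back(rgb[i]);
        }
        for (std::uint32_t pad = width * 3; pad < row_bytes; ++pad) out.push_back(0);
    }
    return out;
}
```

A loader reverses the same steps and should validate the magic bytes, header size, and declared dimensions before allocating, which is where the corrupted-file handling belongs.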
2
Filter System

Convolution Filters:

  • Box blur with configurable kernel size
  • Gaussian blur with sigma parameter
  • Sharpen filter with intensity control
  • Emboss effect with direction options
  • Custom kernel support

Edge Detection:

  • Sobel operator (horizontal, vertical, combined)
  • Prewitt operator
  • Laplacian filter
  • Canny edge detector with thresholds
Deliverable: Filter pipeline supporting at least 10 different effects with composable operations.
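
A composable pipeline can be as simple as an ordered list of callables. The following is a rough sketch under simplifying assumptions: `Pixels` stands in for the real image type, and `Pipeline`, `Stage`, `invert()`, and `threshold()` are hypothetical names rather than a prescribed design.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

using Pixels = std::vector<std::uint8_t>;  // stand-in for an image type
using Stage = std::function<Pixels(const Pixels&)>;

// Hypothetical pipeline: stages run in the order they were added.
class Pipeline {
public:
    Pipeline& add(Stage stage) {
        assert(stage);  // reject empty std::function objects
        stages_.push_back(std::move(stage));
        return *this;   // allows add(...).add(...)
    }

    Pixels run(Pixels input) const {
        for (const auto& stage : stages_) input = stage(input);
        return input;
    }

    std::size_t size() const { return stages_.size(); }

private:
    std::vector<Stage> stages_;
};

// Example stage factories.
Stage invert() {
    return [](const Pixels& in) {
        Pixels out(in);
        for (auto& p : out) p = static_cast<std::uint8_t>(255 - p);
        return out;
    };
}

Stage threshold(std::uint8_t cutoff) {
    return [cutoff](const Pixels& in) {
        Pixels out(in.size());
        for (std::size_t i = 0; i < in.size(); ++i) out[i] = in[i] >= cutoff ? 255 : 0;
        return out;
    };
}
```

The same idea works with a polymorphic filter interface by storing `std::unique_ptr` to filter objects instead of `std::function`.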
3
Color Processing

Color Space Conversions:

  • RGB ↔ HSV conversion with accurate formulas
  • RGB ↔ HSL conversion
  • RGB → Grayscale (luminance-weighted)
  • Sepia tone effect
  • Color inversion (negative)

Color Adjustments:

  • Brightness adjustment
  • Contrast adjustment
  • Saturation control
  • Hue rotation
  • Gamma correction
Deliverable: Complete color processing module with accurate conversions and adjustable parameters.
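
For reference, the standard RGB to HSV formulas can be sketched as below. The conventions are assumptions made for this example: channels are floats in [0, 1], hue is in degrees in [0, 360), and the type names `Rgbf` and `Hsv` are hypothetical. The test data may use a different range convention, so check it before comparing values.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

struct Rgbf { float r, g, b; };  // each channel in [0, 1]
struct Hsv  { float h, s, v; };  // h in [0, 360), s and v in [0, 1]

Hsv rgbToHsv(Rgbf c) {
    const float maxc = std::max({c.r, c.g, c.b});
    const float minc = std::min({c.r, c.g, c.b});
    const float delta = maxc - minc;

    Hsv out{0.0f, 0.0f, maxc};
    if (maxc > 0.0f) out.s = delta / maxc;
    if (delta > 0.0f) {
        // Hue depends on which channel is the maximum.
        if (maxc == c.r)      out.h = 60.0f * std::fmod((c.g - c.b) / delta, 6.0f);
        else if (maxc == c.g) out.h = 60.0f * ((c.b - c.r) / delta + 2.0f);
        else                  out.h = 60.0f * ((c.r - c.g) / delta + 4.0f);
        if (out.h < 0.0f) out.h += 360.0f;
    }
    return out;
}

Rgbf hsvToRgb(Hsv c) {
    assert(c.h >= 0.0f && c.h <= 360.0f);
    const float chroma = c.v * c.s;
    const float hp = c.h / 60.0f;  // hue sector in [0, 6]
    const float x = chroma * (1.0f - std::fabs(std::fmod(hp, 2.0f) - 1.0f));
    const float m = c.v - chroma;

    float r = 0.0f, g = 0.0f, b = 0.0f;
    if (hp < 1.0f)      { r = chroma; g = x; }
    else if (hp < 2.0f) { r = x; g = chroma; }
    else if (hp < 3.0f) { g = chroma; b = x; }
    else if (hp < 4.0f) { g = x; b = chroma; }
    else if (hp < 5.0f) { r = x; b = chroma; }
    else                { r = chroma; b = x; }
    return {r + m, g + m, b + m};
}
```

Doing the math in floating point and converting to 8-bit only at the end helps round-trip accuracy, because repeated rounding to bytes accumulates error.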
4
Geometric Transformations

Basic Transformations:

  • Scale (up/down) with multiple interpolation methods
  • Rotation by arbitrary angle
  • Horizontal and vertical flip
  • Crop to region of interest

Interpolation Methods:

  • Nearest neighbor (fast)
  • Bilinear interpolation
  • Bicubic interpolation (high quality)
Deliverable: Transform system with selectable interpolation methods and smooth rotation at any angle.
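
Flip and crop need no interpolation because they only move whole pixels. A minimal sketch on a toy grayscale struct follows. `Gray`, `flipVertical`, and `crop` are hypothetical names, and throwing `std::out_of_range` for an invalid region is one possible error policy.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical row-major 8-bit image.
struct Gray {
    std::size_t width = 0;
    std::size_t height = 0;
    std::vector<std::uint8_t> pixels;
};

// Mirrors the image top to bottom by copying row y to row (height - 1 - y).
Gray flipVertical(const Gray& in) {
    assert(in.pixels.size() == in.width * in.height);
    Gray out{in.width, in.height, std::vector<std::uint8_t>(in.pixels.size())};
    for (std::size_t y = 0; y < in.height; ++y) {
        const std::size_t dst_y = in.height - 1 - y;
        for (std::size_t x = 0; x < in.width; ++x) {
            out.pixels[dst_y * in.width + x] = in.pixels[y * in.width + x];
        }
    }
    return out;
}

// Copies the w x h region whose top-left corner is (x0, y0).
Gray crop(const Gray& in, std::size_t x0, std::size_t y0,
          std::size_t w, std::size_t h) {
    assert(in.pixels.size() == in.width * in.height);
    // Written to avoid unsigned overflow in x0 + w and y0 + h.
    if (x0 > in.width || w > in.width - x0 ||
        y0 > in.height || h > in.height - y0) {
        throw std::out_of_range("crop region exceeds image bounds");
    }
    Gray out{w, h, std::vector<std::uint8_t>(w * h)};
    for (std::size_t y = 0; y < h; ++y) {
        for (std::size_t x = 0; x < w; ++x) {
            out.pixels[y * w + x] = in.pixels[(y0 + y) * in.width + (x0 + x)];
        }
    }
    return out;
}
```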
05

Library Architecture

Design your library with a clean, modular architecture. The core should be template-based for flexibility, with a simple user-facing API for common operations.

Core Image Class
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Pixel type representing RGBA color
struct RGBA {
    uint8_t r, g, b, a;
    
    RGBA() : r(0), g(0), b(0), a(255) {}
    RGBA(uint8_t r, uint8_t g, uint8_t b, uint8_t a = 255) 
        : r(r), g(g), b(b), a(a) {}
    
    // Convert to grayscale using luminance weights
    uint8_t luminance() const {
        return static_cast<uint8_t>(0.299 * r + 0.587 * g + 0.114 * b);
    }
    
    // Blend with another color
    RGBA blend(const RGBA& other, float alpha) const {
        return RGBA(
            static_cast<uint8_t>(r * (1 - alpha) + other.r * alpha),
            static_cast<uint8_t>(g * (1 - alpha) + other.g * alpha),
            static_cast<uint8_t>(b * (1 - alpha) + other.b * alpha),
            static_cast<uint8_t>(a * (1 - alpha) + other.a * alpha)
        );
    }
};

template<typename PixelType = RGBA>
class Image {
public:
    Image() : width_(0), height_(0) {}
    Image(size_t width, size_t height) 
        : width_(width), height_(height), pixels_(width * height) {}
    
    // Load from file
    static Image load(const std::string& filename);
    
    // Save to file
    bool save(const std::string& filename) const;
    
    // Accessors
    size_t width() const { return width_; }
    size_t height() const { return height_; }
    bool empty() const { return pixels_.empty(); }
    
    // Pixel access
    PixelType& at(size_t x, size_t y) {
        return pixels_[y * width_ + x];
    }
    
    const PixelType& at(size_t x, size_t y) const {
        return pixels_[y * width_ + x];
    }
    
    // Safe access with bounds checking
    PixelType get(int x, int y, PixelType default_val = PixelType()) const {
        if (x < 0 || x >= static_cast<int>(width_) ||
            y < 0 || y >= static_cast<int>(height_)) {
            return default_val;
        }
        return at(x, y);
    }
    
    // Iterator support
    auto begin() { return pixels_.begin(); }
    auto end() { return pixels_.end(); }
    auto begin() const { return pixels_.cbegin(); }
    auto end() const { return pixels_.cend(); }
    
    // Raw data access
    PixelType* data() { return pixels_.data(); }
    const PixelType* data() const { return pixels_.data(); }
    
private:
    size_t width_, height_;
    std::vector<PixelType> pixels_;
};
Filter Pipeline
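#include <algorithm>  // std::clamp
#include <cmath>      // std::abs, std::exp
#include <string>
#include <vector>

// Base filter interface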
class IFilter {
public:
    virtual ~IFilter() = default;
    virtual Image<RGBA> apply(const Image<RGBA>& input) const = 0;
    virtual std::string name() const = 0;
};

// Convolution kernel base
class ConvolutionFilter : public IFilter {
public:
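    explicit ConvolutionFilter(std::vector<std::vector<float>> kernel)
        : kernel_(std::move(kernel)) {
        // Normalize so the weights sum to 1; kernels whose weights sum to
        // (nearly) zero, such as edge detectors, are left unchanged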
        float sum = 0;
        for (const auto& row : kernel_) {
            for (float val : row) sum += val;
        }
        if (std::abs(sum) > 0.001f) {
            for (auto& row : kernel_) {
                for (float& val : row) val /= sum;
            }
        }
    }
    
    Image<RGBA> apply(const Image<RGBA>& input) const override {
        const int kh = static_cast<int>(kernel_.size());
        const int kw = static_cast<int>(kernel_[0].size());
        const int pad_h = kh / 2;
        const int pad_w = kw / 2;
        
        Image<RGBA> output(input.width(), input.height());
        
        // Round, clamp to [0, 255], then narrow explicitly to a byte
        const auto to_byte = [](float v) {
            return static_cast<uint8_t>(std::clamp(std::lround(v), 0L, 255L));
        };
        
        // Parallel processing with OpenMP (the pragma is ignored, and the loop
        // runs serially, unless OpenMP is enabled at compile time)
        #pragma omp parallel for
        for (int y = 0; y < static_cast<int>(input.height()); ++y) {
            for (int x = 0; x < static_cast<int>(input.width()); ++x) {
                float r = 0, g = 0, b = 0;
                
                for (int ky = 0; ky < kh; ++ky) {
                    for (int kx = 0; kx < kw; ++kx) {
                        // get() returns the default pixel (opaque black) outside
                        // the image, so borders are effectively zero-padded
                        auto pixel = input.get(x + kx - pad_w, y + ky - pad_h);
                        float weight = kernel_[ky][kx];
                        r += pixel.r * weight;
                        g += pixel.g * weight;
                        b += pixel.b * weight;
                    }
                }
                
                output.at(x, y) = RGBA(to_byte(r), to_byte(g), to_byte(b),
                                       input.at(x, y).a);
            }
        }
        
        return output;
    }
    
protected:
    std::vector<std::vector<float>> kernel_;
};

// Specific filter implementations
class GaussianBlur : public ConvolutionFilter {
public:
    GaussianBlur(int size = 5, float sigma = 1.0f)
        : ConvolutionFilter(createGaussianKernel(size, sigma)) {}
    
    std::string name() const override { return "Gaussian Blur"; }
    
private:
    static std::vector<std::vector<float>> createGaussianKernel(int size, float sigma) {
        std::vector<std::vector<float>> kernel(size, std::vector<float>(size));
        int center = size / 2;
        float sum = 0;
        
        for (int y = 0; y < size; ++y) {
            for (int x = 0; x < size; ++x) {
                int dx = x - center;
                int dy = y - center;
                kernel[y][x] = std::exp(-(dx*dx + dy*dy) / (2 * sigma * sigma));
                sum += kernel[y][x];
            }
        }
        
        // Normalize
        for (auto& row : kernel) {
            for (float& val : row) val /= sum;
        }
        
        return kernel;
    }
};
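Performance Tip: Use OpenMP or std::thread for parallel pixel processing. Modern CPUs can process different image regions simultaneously, providing significant speedups for large images. OpenMP pragmas take effect only when OpenMP is enabled at compile time (for example -fopenmp on GCC/Clang or /openmp on MSVC); otherwise the loops run serially.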
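
If OpenMP is not available, the same row-level parallelism can be expressed with `std::thread`. The sketch below splits the rows into contiguous bands and gives each band to one thread. `parallelRows` and `invertParallel` are hypothetical helpers, and error handling (for example a thread failing to start) is omitted for brevity.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical helper: calls fn(y_begin, y_end) on disjoint row ranges,
// one range per thread, and waits for all of them to finish.
template <typename Fn>
void parallelRows(std::size_t height, unsigned thread_count, Fn fn) {
    assert(thread_count > 0);
    const std::size_t n =
        std::min<std::size_t>(thread_count, std::max<std::size_t>(height, 1));
    const std::size_t band = (height + n - 1) / n;  // rows per thread, rounded up

    std::vector<std::thread> workers;
    workers.reserve(n);
    for (std::size_t t = 0; t < n; ++t) {
        const std::size_t y0 = std::min(height, t * band);
        const std::size_t y1 = std::min(height, y0 + band);
        if (y0 == y1) break;  // no rows left
        workers.emplace_back(fn, y0, y1);
    }
    for (auto& w : workers) w.join();
}

// Example: invert a row-major 8-bit image using several threads.
void invertParallel(std::vector<std::uint8_t>& pixels, std::size_t width,
                    std::size_t height, unsigned threads) {
    assert(pixels.size() == width * height);
    parallelRows(height, threads,
                 [&pixels, width](std::size_t y0, std::size_t y1) {
                     // Each thread writes only its own rows, so no locking is needed.
                     for (std::size_t i = y0 * width; i < y1 * width; ++i) {
                         pixels[i] = static_cast<std::uint8_t>(255 - pixels[i]);
                     }
                 });
}
```

`std::thread::hardware_concurrency()` is a reasonable starting point for the thread count, but it may return 0 when the value cannot be determined, so fall back to a fixed number in that case.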
06

Filter Algorithms

Implement these filter algorithms with proper convolution. Each filter should produce visually correct results matching standard image processing tools.

Blur Filter Kernels
Box Blur (3×3)
// All weights equal
1/9 * [1 1 1]
      [1 1 1]
      [1 1 1]
Gaussian Blur (5×5)
// Center-weighted
1/256 * [1  4  6  4 1]
        [4 16 24 16 4]
        [6 24 36 24 6]
        [4 16 24 16 4]
        [1  4  6  4 1]
Motion Blur (5×5)
// Diagonal direction
1/5 * [1 0 0 0 0]
      [0 1 0 0 0]
      [0 0 1 0 0]
      [0 0 0 1 0]
      [0 0 0 0 1]
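
The 5×5 Gaussian kernel above is the outer product of the 1D binomial row [1 4 6 4 1] / 16 with itself, which is why the filter can also be applied as two cheaper 1D passes (horizontal, then vertical). As an illustrative sketch, the helper below builds such kernels for any size; `binomialKernel` is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: builds a size x size kernel as the outer product of
// the binomial row of length `size` (a row of Pascal's triangle).
std::vector<std::vector<float>> binomialKernel(int size) {
    assert(size >= 1);
    const std::size_t n = static_cast<std::size_t>(size);

    // Build the Pascal's triangle row in place: 1, then 1 1, then 1 2 1, ...
    std::vector<double> row(n, 0.0);
    row[0] = 1.0;
    for (std::size_t len = 1; len < n; ++len) {
        for (std::size_t k = len; k >= 1; --k) row[k] += row[k - 1];
    }

    double sum = 0.0;
    for (double v : row) sum += v;  // equals 2^(size - 1)

    // kernel[y][x] = row[y] * row[x], normalized so all weights sum to 1.
    std::vector<std::vector<float>> kernel(n, std::vector<float>(n));
    for (std::size_t y = 0; y < n; ++y) {
        for (std::size_t x = 0; x < n; ++x) {
            kernel[y][x] = static_cast<float>(row[y] * row[x] / (sum * sum));
        }
    }
    return kernel;
}
```

With a separable kernel of width k, two 1D passes cost about 2k multiplications per pixel instead of k² for the full 2D kernel.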
Edge Detection Kernels
Sobel Operator
// Horizontal (Gx)        Vertical (Gy)
[-1  0  1]              [-1 -2 -1]
[-2  0  2]              [ 0  0  0]
[-1  0  1]              [ 1  2  1]

// Magnitude: sqrt(Gx² + Gy²)
// Direction: atan2(Gy, Gx)
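
As a rough sketch of how the two Sobel kernels combine into magnitude and direction, the function below works on a flat float grayscale buffer. The `Gradient` struct, the `sobel` name, and the clamp-to-edge border policy are assumptions made for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical result type: per-pixel gradient magnitude and angle (radians).
struct Gradient {
    std::vector<float> magnitude;
    std::vector<float> direction;
};

Gradient sobel(const std::vector<float>& gray, int width, int height) {
    assert(width > 0 && height > 0);
    assert(gray.size() == static_cast<std::size_t>(width) * height);

    static const int kGx[3][3] = {{-1, 0, 1}, {-2, 0, 2}, {-1, 0, 1}};
    static const int kGy[3][3] = {{-1, -2, -1}, {0, 0, 0}, {1, 2, 1}};

    // Clamp-to-edge sampling: coordinates outside the image reuse the border.
    const auto sample = [&](int x, int y) {
        x = std::clamp(x, 0, width - 1);
        y = std::clamp(y, 0, height - 1);
        return gray[static_cast<std::size_t>(y) * width + x];
    };

    Gradient out{std::vector<float>(gray.size()), std::vector<float>(gray.size())};
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            float gx = 0.0f, gy = 0.0f;
            for (int ky = 0; ky < 3; ++ky) {
                for (int kx = 0; kx < 3; ++kx) {
                    const float v = sample(x + kx - 1, y + ky - 1);
                    gx += v * kGx[ky][kx];
                    gy += v * kGy[ky][kx];
                }
            }
            const std::size_t i = static_cast<std::size_t>(y) * width + x;
            out.magnitude[i] = std::hypot(gx, gy);  // sqrt(gx^2 + gy^2)
            out.direction[i] = std::atan2(gy, gx);
        }
    }
    return out;
}
```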
Laplacian Operator
// 4-connectivity        8-connectivity
[ 0  1  0]              [ 1  1  1]
[ 1 -4  1]              [ 1 -8  1]
[ 0  1  0]              [ 1  1  1]
Canny Edge Detection Algorithm
Image<uint8_t> cannyEdgeDetection(const Image<RGBA>& input, 
                                   float low_threshold, 
                                   float high_threshold) {
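    // Outline only: toGrayscale, sobelGradient, nonMaximumSuppression,
    // doubleThreshold, and hysteresisTracking are helpers you implement.
    // Step 1: Convert to grayscale
    auto gray = toGrayscale(input);
    
    // Step 2: Apply Gaussian blur to reduce noise
    // (needs an apply() overload for the grayscale image type; the IFilter
    // interface shown earlier accepts only Image<RGBA>)
    auto blurred = GaussianBlur(5, 1.4f).apply(gray);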
    
    // Step 3: Compute gradient magnitude and direction (Sobel)
    auto [magnitude, direction] = sobelGradient(blurred);
    
    // Step 4: Non-maximum suppression
    auto suppressed = nonMaximumSuppression(magnitude, direction);
    
    // Step 5: Double threshold
    auto edges = doubleThreshold(suppressed, low_threshold, high_threshold);
    
    // Step 6: Edge tracking by hysteresis
    return hysteresisTracking(edges);
}
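
Steps 5 and 6 are the least obvious part of the outline, so here is a hedged sketch that folds them into one function: pixels at or above the high threshold seed the edge map, and edges then grow into 8-connected neighbors at or above the low threshold. The `hysteresis` name, signature, and flat float buffer are assumptions that differ from the helper names used in the outline.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: returns 255 for edge pixels and 0 elsewhere.
std::vector<std::uint8_t> hysteresis(const std::vector<float>& magnitude,
                                     int width, int height,
                                     float low, float high) {
    assert(width > 0 && height > 0);
    assert(magnitude.size() == static_cast<std::size_t>(width) * height);
    assert(low <= high);

    std::vector<std::uint8_t> edges(magnitude.size(), 0);
    std::vector<int> stack;

    // Double threshold: strong pixels are edges immediately and become seeds.
    for (int i = 0; i < width * height; ++i) {
        if (magnitude[static_cast<std::size_t>(i)] >= high) {
            edges[static_cast<std::size_t>(i)] = 255;
            stack.push_back(i);
        }
    }

    // Hysteresis: flood-fill from the seeds through weak pixels (>= low).
    while (!stack.empty()) {
        const int idx = stack.back();
        stack.pop_back();
        const int x = idx % width;
        const int y = idx / width;
        for (int dy = -1; dy <= 1; ++dy) {
            for (int dx = -1; dx <= 1; ++dx) {
                const int nx = x + dx;
                const int ny = y + dy;
                if (nx < 0 || nx >= width || ny < 0 || ny >= height) continue;
                const std::size_t n = static_cast<std::size_t>(ny) * width + nx;
                if (edges[n] == 0 && magnitude[n] >= low) {
                    edges[n] = 255;
                    stack.push_back(static_cast<int>(n));
                }
            }
        }
    }
    return edges;
}
```

Weak pixels that are never reached from a strong pixel stay at 0, which is what suppresses isolated noise responses.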
Special Effect Kernels
Sharpen
[ 0 -1  0]
[-1  5 -1]
[ 0 -1  0]
Emboss
[-2 -1  0]
[-1  1  1]
[ 0  1  2]
Unsharp Mask
// Amount = 1.0: original + 1.0 * (original - Gaussian 5x5)
1/256 * [-1  -4  -6  -4 -1]
        [-4 -16 -24 -16 -4]
        [-6 -24 476 -24 -6]
        [-4 -16 -24 -16 -4]
        [-1  -4  -6  -4 -1]
07

Geometric Transformations

Implement geometric transformations with proper interpolation for high-quality results. The choice of interpolation method significantly affects output quality and performance.

Transformation Implementation
// Interpolation types
enum class Interpolation {
    Nearest,    // Fastest, pixelated results
    Bilinear,   // Good balance of quality and speed
    Bicubic     // Highest quality, slower
};

// Bilinear interpolation
RGBA bilinearInterpolate(const Image<RGBA>& img, float x, float y) {
    int x0 = static_cast<int>(std::floor(x));
    int y0 = static_cast<int>(std::floor(y));
    int x1 = x0 + 1;
    int y1 = y0 + 1;
    
    float fx = x - x0;
    float fy = y - y0;
    
    auto p00 = img.get(x0, y0);
    auto p10 = img.get(x1, y0);
    auto p01 = img.get(x0, y1);
    auto p11 = img.get(x1, y1);
    
    // Interpolate along x (RGBA::blend is a per-channel linear interpolation)
    auto top = p00.blend(p10, fx);
    auto bottom = p01.blend(p11, fx);
    
    // Interpolate along y
    return top.blend(bottom, fy);
}

// Rotation transform
Image<RGBA> rotate(const Image<RGBA>& input, float angle_degrees, 
                    Interpolation interp = Interpolation::Bilinear) {
    constexpr float kPi = 3.14159265f;  // M_PI is a POSIX extension, not standard C++
    float angle = angle_degrees * kPi / 180.0f;
    float cos_a = std::cos(angle);
    float sin_a = std::sin(angle);
    
    // Calculate new dimensions
    int w = input.width();
    int h = input.height();
    int new_w = static_cast<int>(std::abs(w * cos_a) + std::abs(h * sin_a));
    int new_h = static_cast<int>(std::abs(w * sin_a) + std::abs(h * cos_a));
    
    Image<RGBA> output(new_w, new_h);
    
    float cx = w / 2.0f;
    float cy = h / 2.0f;
    float ncx = new_w / 2.0f;
    float ncy = new_h / 2.0f;
    
    #pragma omp parallel for
    for (int y = 0; y < new_h; ++y) {
        for (int x = 0; x < new_w; ++x) {
            // Map from output to input coordinates (inverse mapping)
            float dx = x - ncx;
            float dy = y - ncy;
            
            float src_x = cos_a * dx + sin_a * dy + cx;
            float src_y = -sin_a * dx + cos_a * dy + cy;
            
            if (src_x >= 0 && src_x < w - 1 && src_y >= 0 && src_y < h - 1) {
                switch (interp) {
                    case Interpolation::Nearest:
                        output.at(x, y) = input.at(
                            static_cast<int>(std::round(src_x)),
                            static_cast<int>(std::round(src_y))
                        );
                        break;
                    case Interpolation::Bilinear:
                        output.at(x, y) = bilinearInterpolate(input, src_x, src_y);
                        break;
                    case Interpolation::Bicubic:
                        output.at(x, y) = bicubicInterpolate(input, src_x, src_y);
                        break;
                }
            }
        }
    }
    
    return output;
}

// Scale transform
Image<RGBA> scale(const Image<RGBA>& input, float scale_x, float scale_y,
                   Interpolation interp = Interpolation::Bilinear) {
    int new_w = static_cast<int>(input.width() * scale_x);
    int new_h = static_cast<int>(input.height() * scale_y);
    
    Image<RGBA> output(new_w, new_h);
    
    #pragma omp parallel for
    for (int y = 0; y < new_h; ++y) {
        for (int x = 0; x < new_w; ++x) {
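            // Map output pixel centers to input pixel centers, clamped to the
            // image so edge samples never read outside it (needs <algorithm>)
            float src_x = std::clamp((x + 0.5f) / scale_x - 0.5f,
                                     0.0f, input.width() - 1.0f);
            float src_y = std::clamp((y + 0.5f) / scale_y - 0.5f,
                                     0.0f, input.height() - 1.0f);
            
            // Apply interpolation (interpolate() is a helper that dispatches on interp)
            output.at(x, y) = interpolate(input, src_x, src_y, interp);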
        }
    }
    
    return output;
}
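
The listings above call bicubicInterpolate without defining it. One common choice, shown here as a sketch rather than the required method, is the Catmull-Rom cubic applied first along x and then along y over a 4×4 neighborhood. The example works on a flat float grayscale buffer with clamp-to-edge sampling; `cubic` and `bicubicSample` are hypothetical names, and an RGBA version would apply the same weights per channel.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Catmull-Rom cubic: interpolates between p1 (t = 0) and p2 (t = 1),
// using p0 and p3 as outer neighbors.
float cubic(float p0, float p1, float p2, float p3, float t) {
    const float a = -0.5f * p0 + 1.5f * p1 - 1.5f * p2 + 0.5f * p3;
    const float b = p0 - 2.5f * p1 + 2.0f * p2 - 0.5f * p3;
    const float c = -0.5f * p0 + 0.5f * p2;
    return ((a * t + b) * t + c) * t + p1;
}

// Hypothetical sampler: bicubic lookup at fractional coordinates (x, y).
float bicubicSample(const std::vector<float>& gray, int width, int height,
                    float x, float y) {
    assert(width > 0 && height > 0);
    assert(gray.size() == static_cast<std::size_t>(width) * height);

    const int x0 = static_cast<int>(std::floor(x));
    const int y0 = static_cast<int>(std::floor(y));
    const float fx = x - static_cast<float>(x0);
    const float fy = y - static_cast<float>(y0);

    // Clamp-to-edge sampling for neighbors outside the image.
    const auto sample = [&](int sx, int sy) {
        sx = std::clamp(sx, 0, width - 1);
        sy = std::clamp(sy, 0, height - 1);
        return gray[static_cast<std::size_t>(sy) * width + sx];
    };

    // Interpolate four rows along x, then combine the results along y.
    float rows[4];
    for (int j = 0; j < 4; ++j) {
        const int sy = y0 - 1 + j;
        rows[j] = cubic(sample(x0 - 1, sy), sample(x0, sy),
                        sample(x0 + 1, sy), sample(x0 + 2, sy), fx);
    }
    return cubic(rows[0], rows[1], rows[2], rows[3], fy);
}
```

Cubic interpolation can overshoot the input range near sharp edges, so clamp the result to [0, 255] before storing it in an 8-bit channel.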
08

Submission Requirements

Create a public GitHub repository with the exact name shown below:

Required Repository Name
cpp-image-processor
github.com/<your-username>/cpp-image-processor
Required Project Structure
cpp-image-processor/
├── include/
│   └── imgproc/
│       ├── image.hpp           # Core image class
│       ├── filters.hpp         # Filter implementations
│       ├── transforms.hpp      # Geometric transforms
│       ├── color.hpp           # Color space conversions
│       └── io.hpp              # File I/O operations
├── src/
│   ├── filters.cpp
│   ├── transforms.cpp
│   ├── color.cpp
│   └── io.cpp
├── tests/
│   ├── test_filters.cpp
│   ├── test_transforms.cpp
│   ├── test_color.cpp
│   └── test_io.cpp
├── examples/
│   ├── basic_filters.cpp
│   ├── edge_detection.cpp
│   └── batch_processing.cpp
├── docs/
│   └── API.md
├── CMakeLists.txt
└── README.md
Do Include
  • All required header and source files
  • CMake build configuration
  • Unit tests for all major components
  • Example programs demonstrating features
  • API documentation
  • Sample images for testing
Do Not Include
  • Build artifacts (*.o, *.exe, build/)
  • IDE-specific files (.vs/, .idea/)
  • Large image datasets (> 10MB total)
  • External libraries (use CMake FetchContent)
  • Temporary or backup files

09

Grading Rubric

Your project will be graded on the following criteria. Total: 700 points.

Criteria | Points | Description
Image I/O | 100 | PNG, JPG, BMP support with proper error handling
Filter System | 150 | At least 10 filters including edge detection
Color Processing | 100 | Accurate color space conversions and adjustments
Transformations | 100 | Scale, rotate, flip with interpolation options
Performance | 100 | Meets timing requirements with parallel processing
Code Quality | 75 | Clean API, proper OOP, modern C++ practices
Testing & Docs | 75 | Unit tests, examples, and API documentation
Total | 700 |
Grading Levels
  • Excellent (630-700): Exceeds all requirements
  • Good (525-629): Meets all requirements
  • Satisfactory (420-524): Meets minimum requirements
  • Needs Work (< 420): Missing key requirements
