Project 5: Text Editor with Undo/Redo | Java DSA Course

Project Overview

This capstone project challenges you to build a text editor with complete undo/redo functionality. You will work with real editing operation data containing 100+ editing events including insertions, deletions, undo/redo operations, and cursor movements. Your goal is to implement stack-based command pattern and understand memory management techniques that power professional text editors.

Skills Applied: This project tests your proficiency in Stacks (dual-stack undo/redo), Command Pattern (encapsulating operations as objects), Memory Management (efficient allocation and cleanup), and Design Patterns (Command, Memento for state snapshots).

Stack-based Commands

Command pattern with dual stacks

Undo/Redo System

Full operation history management

Memory Management

Efficient memory allocation & cleanup

Optimization

Performance tuning & state compression

Learning Objectives

Technical Skills

Implement Rope data structure with balanced trees
Build Gap Buffer for efficient local editing
Create Piece Table for append-only editing
Implement undo/redo using Command and Memento patterns
Apply KMP/Rabin-Karp for efficient text search

Problem-Solving Skills

Understand why String is inefficient for editors
Analyze time complexity of edit operations
Compare trade-offs between different approaches
Handle edge cases in text manipulation
Test with real VS Code editing data

Ready to submit? Already completed the project? Submit your work now!

Submit Now

Problem Scenario

CodeCraft IDE

You have been hired as a Core Engineer at CodeCraft IDE, a startup building a next-generation code editor to compete with VS Code and Sublime Text. The team has discovered that using Java's String class for document storage causes severe performance issues when editing large files - every keystroke becomes slow because strings are immutable and require O(n) copy operations.

"Our prototype crashes when opening files over 1MB. Simple typing has noticeable lag. We need you to implement proper data structures like VS Code uses. Can you build a text buffer that handles millions of characters with O(log n) edits and efficient undo/redo?"

Emily Chen, Lead Architect

Problems to Solve

Text Storage

Efficient insert/delete at any position
Handle files with millions of characters
O(log n) operations instead of O(n)
Memory-efficient for large documents

Undo/Redo

Unlimited undo history
Group related operations together
Efficient memory usage for history
Support redo after undo

Find & Replace

Fast pattern matching in large files
Find all occurrences efficiently
Replace single or all matches
Support regex patterns (bonus)

Cursor & Selection

Track cursor position efficiently
Support text selection ranges
Handle multi-cursor editing
Line/column navigation

Pro Tip: VS Code uses a Piece Table for its text buffer, Vim uses a Gap Buffer, and many editors use Rope. Each has different trade-offs - your job is to implement at least one and understand when to use each!

The Dataset

You will work with a dataset of real text editor operations. Download the CSV file containing editing events to test your implementation:

Dataset Download

Download the text editor operations dataset. We provide a curated sample, or you can explore the full keystroke dynamics dataset on Kaggle for extended testing.

Keystrokes Dataset (Kaggle) Sample Operations (CSV)

100 editing operations including INSERT, DELETE, UNDO, REDO, FIND, REPLACE across 10 documents

Original Data Source

This project uses editing patterns inspired by the Keystrokes Dataset from Kaggle. The dataset contains 136 million keystrokes capturing real typing sessions including press/release times, keystroke dynamics, and text entry behaviors from multiple participants.

View on Kaggle VS Code Text Buffer Blog

Dataset Info: 136M keystrokes | Typing sessions | Press/Release timestamps | Keystroke dynamics for text editor analysis

Dataset Schema

Each row represents an editing operation with the following columns:

Column	Type	Description
`operation_id`	String	Unique operation identifier (e.g., OP001)
`timestamp`	DateTime	When the operation occurred
`operation_type`	String	INSERT, DELETE, UNDO, REDO, FIND, REPLACE, etc.
`position`	Integer	Cursor position (0-indexed)
`length`	Integer	Number of characters affected
`content`	String	Text content (for INSERT/REPLACE operations)
`document_id`	String	Document being edited (e.g., DOC001)
`user_id`	String	User performing the operation
`session_id`	String	Editing session identifier
`undo_group`	Integer	Group ID for undo operations

Operations are distributed across the following types:

INSERT (45 operations)
DELETE (15 operations)
UNDO (12 operations)

REDO (5 operations)
FIND (8 operations)
REPLACE (6 operations)

CURSOR_MOVE (5 operations)
SELECT (2 operations)
CUT/COPY/PASTE (2 operations)

Parse operations and replay them to test your implementation:

Create an EditOperation class with type, position, length, and content fields
Load operations from CSV and iterate through them
Use switch statement to handle INSERT, DELETE, UNDO, REDO operations
Verify your text buffer produces correct output after replaying all operations

Sample Data Preview

Here is what the data looks like from text_editor_operations.csv:

Op ID	Type	Position	Length	Content	Doc ID
OP001	INSERT	0	13	Hello World!	DOC001
OP004	DELETE	5	6	-	DOC001
OP006	UNDO	0	0	-	DOC001

Dataset Stats: 100 operations | 10 documents | 5 users | Operations: INSERT, DELETE, UNDO, REDO, FIND, REPLACE

For Testing: Replay operations in order to verify your text buffer produces correct output

Data Quality Note: The dataset contains realistic editing patterns including rapid typing, bulk deletions, frequent undo/redo, and find-replace operations. Your implementation should handle these efficiently.

Project Requirements

Your project must include all of the following components. Structure your code with clear organization, documentation, and comprehensive testing.

Text Buffer Implementation (Choose 1+)

Implement at least ONE of the following data structures:

Rope: Binary tree where leaves contain strings, O(log n) operations
Gap Buffer: Array with movable gap at cursor, O(1) local edits
Piece Table: Append-only with piece descriptors, used by VS Code

Deliverable: TextBuffer interface with insert(), delete(), charAt(), substring(), length() methods.

Undo/Redo System

Implement complete undo/redo functionality:

Command Pattern: Encapsulate operations as objects
Undo Stack: Track all operations for reversal
Redo Stack: Track undone operations for replay
Group Operations: Batch related edits (e.g., word completion)

Deliverable: UndoManager class with undo(), redo(), beginGroup(), endGroup() methods.

Find & Replace

Implement efficient text search:

KMP Algorithm: O(n+m) pattern matching
Find All: Return all occurrences of pattern
Replace: Replace single or all occurrences
Case Sensitivity: Support case-insensitive search

Deliverable: SearchEngine class with find(), findAll(), replace(), replaceAll() methods.

Documentation and Analysis

Technical documentation:

Data Structure Analysis: Why you chose Rope/Gap Buffer/Piece Table
Complexity Analysis: Time/space for each operation
Comparison: Trade-offs between different approaches
Performance Tests: Benchmarks with large files

Deliverable: README.md with comprehensive documentation and analysis.

Algorithm Specifications

Implement the following algorithms with optimal time and space complexity. Each algorithm should be thoroughly tested and documented.

Command Pattern (Core)

Encapsulate each edit operation as a command object.

Create Command interface with execute() and undo() methods
Implement InsertCommand, DeleteCommand classes
Store command state for reversal (position, text, length)
Each command knows how to undo itself

Time: O(1) execute/undo Space: O(n) history

Dual Stack Undo/Redo (Core)

Use two stacks to manage operation history.

Undo Stack: Push commands after execution
Redo Stack: Push commands after undo
Clear redo stack on new operation
Handle stack underflow gracefully

Time: O(1) undo/redo Space: O(ops)

Memory Management (Important)

Optimize memory usage for large documents.

Limit undo history size (e.g., max 100 operations)
Compress consecutive similar operations
Use StringBuilder for efficient string manipulation
Implement lazy evaluation where possible

Memory: Bounded Trade-off: History vs Memory

Operation Grouping (Advanced)

Group related operations for better UX.

Batch rapid keystrokes into single undo unit
Use timestamps to detect typing sessions
Implement beginGroup() / endGroup()
Undo entire group as single operation

UX: Better Complexity: Medium

Data Structure Specifications

Implement the following data structures to support efficient text editing operations. Choose at least ONE text buffer implementation.

Rope (Recommended)

Binary tree where leaves store string fragments.

insert(pos, str) - O(log n)
delete(pos, len) - O(log n)
charAt(index) - O(log n)
substring(start, end) - O(log n + k)
concat(rope1, rope2) - O(log n)
split(pos) - O(log n)

Best For: Frequent concatenation and splitting operations.

Gap Buffer

Array with movable gap at cursor position.

insert(char) - O(1) at cursor
delete() - O(1) at cursor
moveCursor(pos) - O(gap movement)
charAt(index) - O(1)
toString() - O(n)

Best For: Sequential typing at cursor (Vim/Emacs use this).

Piece Table

Append-only buffers with piece descriptors (VS Code).

insert(pos, str) - O(pieces)
delete(pos, len) - O(pieces)
charAt(index) - O(pieces)
Original buffer: immutable file content
Add buffer: append-only new text
Pieces: (buffer, start, length) descriptors

Best For: Efficient undo (just remove pieces), large files.

UndoManager

Stack-based history management.

execute(Command cmd) - O(1)
undo() - O(1)
redo() - O(1)
canUndo() - O(1)
canRedo() - O(1)
beginGroup() / endGroup() - O(1)

Pattern: Command Pattern with two stacks (undo/redo).

Submission Guidelines

Submit your project via GitHub. Ensure your repository is public and follows the structure below.

Repository Structure

Required Folders

src/ - Source code files
src/commands/ - Command pattern classes
src/editor/ - Main editor class
test/ - JUnit test files
data/ - CSV dataset

Required Files

Command.java - Command interface
InsertCommand.java - Insert operation
DeleteCommand.java - Delete operation
UndoManager.java - Stack management
README.md - Documentation

README Requirements

Project overview and objectives
Command Pattern implementation details
Undo/Redo stack logic explanation
Memory management strategies used
Usage examples with sample outputs
Time and space complexity analysis

Submission Checklist

Command interface with execute/undo
InsertCommand and DeleteCommand classes
UndoManager with dual stacks
Memory optimization (history limit)
JUnit test suite with edge cases
Code properly documented

Submit Your Project

Grading Rubric

Your project will be evaluated on the following criteria. Maximum score is 500 points.

Category	Criteria	Points
Text Buffer	Rope OR Gap Buffer OR Piece Table (100 pts) Correct insert/delete operations (30 pts) Optimal time complexity (20 pts)	150
Undo/Redo	Command Pattern implementation (40 pts) Undo stack functionality (25 pts) Redo stack functionality (20 pts) Group operations support (15 pts)	100
Find & Replace	KMP or Rabin-Karp implementation (40 pts) Find all occurrences (20 pts) Replace single/all (20 pts) Case-insensitive option (10 pts)	90
Testing	Unit test coverage (40 pts) Edge case handling (30 pts) Operation replay from CSV (20 pts)	90
Documentation	README completeness (20 pts) Code documentation/Javadoc (15 pts) Complexity analysis (20 pts) Performance benchmarks (15 pts)	70
Total		500

Excellent

450-500

Outstanding implementation

Good

350-449

Meets all requirements

Needs Work

< 350

Missing key requirements

Text Editor with Undo/Redo

What You Will Build

Contents

Project Overview

Stack-based Commands

Undo/Redo System

Memory Management

Optimization

Learning Objectives

Technical Skills

Problem-Solving Skills

Problem Scenario

CodeCraft IDE

Problems to Solve

The Dataset

Dataset Download

Original Data Source

Dataset Schema

Sample Data Preview

Project Requirements

Text Buffer Implementation (Choose 1+)

Undo/Redo System

Find & Replace

Documentation and Analysis

Algorithm Specifications

Data Structure Specifications

Submission Guidelines

Required Folders

Required Files

README Requirements

Submission Checklist

Grading Rubric

Excellent

Good

Needs Work

Text Editor with Undo/Redo

What You Will Build

Contents

Project Overview

Stack-based Commands

Undo/Redo System

Memory Management

Optimization

Learning Objectives

Technical Skills

Problem-Solving Skills

Problem Scenario

CodeCraft IDE

Problems to Solve

The Dataset

Dataset Download

Original Data Source

Dataset Schema

1 text_editor_operations.csv - Editing Operations (100 records)

2 Operation Types & Distribution

3 For Text Editor Implementation

Sample Data Preview

Project Requirements

Text Buffer Implementation (Choose 1+)

Undo/Redo System

Find & Replace

Documentation and Analysis

Algorithm Specifications

Data Structure Specifications

Submission Guidelines

Required Folders

Required Files

README Requirements

Submission Checklist

Grading Rubric

Excellent

Good

Needs Work