Project Overview
This capstone project brings together everything you have learned in the Java DSA course. You will work with real stock market data containing 10,000+ daily price records spanning multiple years (2020-2024), covering 50 stocks across various sectors (Technology, Healthcare, Finance, Energy, Consumer). Your goal is to build a professional trading analysis system that applies data structures and algorithms to solve classic stock trading problems and optimize investment strategies.
Price Analysis
Arrays, sliding windows, prefix sums for trend analysis
Trading Optimization
Dynamic programming for maximum profit strategies
Data Structures
Heaps, stacks, hash maps for efficient operations
Performance
Algorithm complexity analysis and optimization
Learning Objectives
Technical Skills
- Master dynamic programming for multi-transaction trading
- Implement sliding window algorithms for trend detection
- Use monotonic stacks for stock span problems
- Build efficient hash maps for O(1) portfolio lookups
- Apply heaps for top K stock selection
Problem-Solving Skills
- Translate financial problems into algorithmic solutions
- Analyze time and space complexity trade-offs
- Optimize brute-force solutions with memoization
- Handle edge cases in trading scenarios
- Test algorithms with real-world data
Problem Scenario
AlgoTrade Capital
You have been hired as a Quantitative Developer at AlgoTrade Capital, an algorithmic trading firm that uses data structures and algorithms to make trading decisions. The firm manages portfolios across multiple sectors and needs efficient systems to analyze market data, identify optimal trading opportunities, and maximize returns while managing risk.
"We process millions of price points daily. We need algorithms that can determine the best times to buy and sell, calculate maximum profit strategies with transaction limits, and track stock performance efficiently. Can you build a trading system that solves these classic problems optimally?"
Problems to Solve
- Best time to buy and sell (one transaction)
- Maximum profit with single buy/sell
- Track minimum price seen so far
- O(n) time complexity required
- Unlimited buy/sell transactions
- Maximum profit with no transaction limit
- Cannot hold multiple stocks at once
- Greedy or DP approach
- Maximum K buy/sell transactions
- Dynamic programming with states
- Space optimization techniques
- Handle edge cases (k >= n/2)
- Trading with cooldown constraint
- Cannot buy immediately after sell
- State machine approach
- Transaction fee variations
The Dataset
You will work with a comprehensive stock price dataset. Download the CSV files containing historical price data for analysis:
Dataset Download
Download stock price data directly from Kaggle. The dataset contains individual CSV files for each stock ticker with historical OHLCV (Open, High, Low, Close, Volume) data.
Original Data Source
This project uses the Huge Stock Market Dataset from Kaggle - one of the most comprehensive historical stock price datasets available. It contains daily stock prices for all companies traded on NYSE, NASDAQ, and other US exchanges with over 7,000+ stocks spanning decades of market data.
Dataset Schema
Each stock has its own file named {ticker}.us.txt with the following columns:
| Column | Type | Description |
|---|---|---|
Date | Date | Trading date (YYYY-MM-DD format) |
Open | Decimal | Opening price for the trading day |
High | Decimal | Highest price reached during the day |
Low | Decimal | Lowest price reached during the day |
Close | Decimal | Closing price for the trading day |
Volume | Integer | Total number of shares traded |
OpenInt | Integer | Open interest (usually 0 for stocks) |
The Kaggle dataset is organized into folders:
Stocks/
├── aapl.us.txt # Apple Inc.
├── googl.us.txt # Alphabet Inc.
├── msft.us.txt # Microsoft
├── amzn.us.txt # Amazon
├── tsla.us.txt # Tesla
└── ... (7,000+ more)
Each file contains daily OHLCV data from the stock's IPO to 2017.
For LeetCode-style trading problems, extract the Close column as a price array:
// Example: Reading close prices from CSV
int[] prices = {7, 1, 5, 3, 6, 4}; // From Close column
// This becomes input for:
// - Best Time to Buy and Sell Stock I, II, III, IV
// - Stock with Cooldown
// - Stock with Transaction Fee
Sample Data Preview
Here is what the data looks like from aapl.us.txt (Apple stock):
| Date | Open | High | Low | Close | Volume | OpenInt |
|---|---|---|---|---|---|---|
| 2017-11-01 | 166.91 | 168.87 | 166.01 | 166.89 | 36,590,526 | 0 |
| 2017-11-02 | 166.60 | 168.50 | 165.28 | 168.11 | 33,610,580 | 0 |
| 2017-11-03 | 174.00 | 174.26 | 171.12 | 172.50 | 59,398,636 | 0 |
Project Requirements
Your project must include all of the following components. Structure your code with clear organization, documentation, and comprehensive testing.
Trading Algorithms (5 Required)
Implement the following classic LeetCode-style problems:
- Best Time to Buy and Sell Stock I: Single transaction, maximum profit
- Best Time to Buy and Sell Stock II: Unlimited transactions
- Best Time to Buy and Sell Stock III: At most 2 transactions
- Best Time to Buy and Sell Stock IV: At most K transactions
- Best Time with Cooldown: Must wait 1 day after selling
Data Structures Implementation
Build the following custom data structures:
- StockPriceTracker: Hash map + min/max heaps for real-time tracking
- StockSpan: Monotonic stack for calculating price spans
- Portfolio: Hash map with O(1) operations for holdings
- TopKStocks: Min heap for tracking top K performers
Testing and Validation
Comprehensive test coverage:
- Unit Tests: JUnit tests for each algorithm and data structure
- Edge Cases: Empty arrays, single element, all same prices, decreasing prices
- Performance Tests: Verify O(n) complexity with large inputs
- Integration Tests: Test with actual CSV data
Documentation and Analysis
Technical documentation:
- Algorithm Explanations: Describe approach for each trading problem
- Complexity Analysis: Time and space complexity for each solution
- Trade-offs: Compare different approaches (brute force vs DP)
- Results: Performance on provided dataset
Algorithm Specifications
Implement the following algorithms with optimal time and space complexity. Each algorithm should be thoroughly tested and documented.
Find maximum profit with one buy and one sell.
// Example:
// prices = [7, 1, 5, 3, 6, 4]
// Output: 5 (buy at 1, sell at 6)
public int maxProfit(int[] prices) {
int minPrice = Integer.MAX_VALUE;
int maxProfit = 0;
for (int price : prices) {
minPrice = Math.min(minPrice, price);
maxProfit = Math.max(maxProfit, price - minPrice);
}
return maxProfit;
}
Find maximum profit with unlimited transactions.
// Example:
// prices = [7, 1, 5, 3, 6, 4]
// Output: 7 (buy@1 sell@5, buy@3 sell@6)
public int maxProfit(int[] prices) {
int profit = 0;
for (int i = 1; i < prices.length; i++) {
if (prices[i] > prices[i - 1]) {
profit += prices[i] - prices[i - 1];
}
}
return profit;
}
Find maximum profit with at most K transactions.
// Dynamic Programming approach
// dp[i][j] = max profit using i transactions
// up to day j
public int maxProfit(int k, int[] prices) {
int n = prices.length;
if (k >= n / 2) {
return unlimitedTransactions(prices);
}
int[][] dp = new int[k + 1][n];
// ... implementation
}
Cannot buy the day after selling.
// State machine approach:
// hold = max profit while holding stock
// sold = max profit just sold
// rest = max profit in cooldown
public int maxProfit(int[] prices) {
int hold = Integer.MIN_VALUE;
int sold = 0, rest = 0;
for (int price : prices) {
int prevSold = sold;
sold = hold + price;
hold = Math.max(hold, rest - price);
rest = Math.max(rest, prevSold);
}
return Math.max(sold, rest);
}
Data Structure Requirements
Build custom data structures optimized for stock trading operations. Each structure should have documented methods, time complexity, and test cases.
StockSpan Calculator
Calculate consecutive days where price was ≤ current day's price.
class StockSpan {
private Stack<int[]> stack; // [price, span]
public int next(int price) {
int span = 1;
while (!stack.isEmpty() &&
stack.peek()[0] <= price) {
span += stack.pop()[1];
}
stack.push(new int[]{price, span});
return span;
}
}
Stock Price Tracker
Track current, min, and max prices efficiently.
class StockPriceTracker {
private Map<Integer, Integer> prices;
private TreeMap<Integer, Integer> priceCounts;
private int latestTime;
public void update(int time, int price) {...}
public int current() {...}
public int maximum() {...}
public int minimum() {...}
}
Portfolio Manager
Manage stock holdings with O(1) lookups.
class Portfolio {
private Map<String, Holding> holdings;
public void buy(String symbol, int qty, double price);
public void sell(String symbol, int qty, double price);
public double getValue(Map<String, Double> prices);
public double getTotalProfit();
public List<Holding> getHoldings();
}
Top K Stocks
Track top K performing stocks in real-time.
class TopKStocks {
private PriorityQueue<Stock> minHeap;
private int k;
public void update(String symbol, double gain);
public List<Stock> getTopK();
public boolean isTopK(String symbol);
}
Submission Requirements
Create a public GitHub repository with the exact name shown below:
Required Repository Name
stock-trading-system
Required Project Structure
stock-trading-system/
├── src/
│ ├── main/java/com/trading/
│ │ ├── algorithms/
│ │ │ ├── SingleTransaction.java
│ │ │ ├── UnlimitedTransactions.java
│ │ │ ├── KTransactions.java
│ │ │ ├── WithCooldown.java
│ │ │ └── WithTransactionFee.java
│ │ ├── datastructures/
│ │ │ ├── StockSpan.java
│ │ │ ├── StockPriceTracker.java
│ │ │ ├── Portfolio.java
│ │ │ └── TopKStocks.java
│ │ ├── models/
│ │ │ ├── Stock.java
│ │ │ └── Holding.java
│ │ └── Main.java
│ └── test/java/com/trading/
│ ├── algorithms/
│ └── datastructures/
├── data/
│ ├── stock_prices.csv
│ ├── stock_metadata.csv
│ └── sample_portfolios.csv
├── docs/
│ ├── algorithm_analysis.md
│ └── complexity_comparison.md
└── README.md
README.md Required Sections
1. Project Header
- Project title and description
- Your full name and submission date
- Course and project number
2. Problem Descriptions
- List all trading problems solved
- LeetCode problem references
- Difficulty levels
3. Implementation Details
- Approach for each algorithm
- Time and space complexity
- Key insights and optimizations
4. Data Structures
- Description of each custom structure
- Methods and their complexities
- Use cases in trading context
5. Testing
- How to run tests
- Test coverage summary
- Edge cases handled
6. Results
- Performance on provided dataset
- Sample outputs
- Profit calculations
Enter your GitHub username - we will verify your repository automatically
Grading Rubric
Your project will be graded on the following criteria. Total: 500 points.
| Criteria | Points | Description |
|---|---|---|
| Trading Algorithms | 150 | All 5 algorithms implemented correctly with optimal complexity |
| Data Structures | 100 | Custom structures with proper encapsulation and O(1)/O(log n) operations |
| Code Quality | 75 | Clean code, proper naming, SOLID principles, no code smells |
| Testing | 75 | Comprehensive test coverage, edge cases, performance tests |
| Documentation | 50 | README quality, inline comments, complexity analysis |
| Project Structure | 25 | Proper package organization, file naming, folder structure |
| Bonus Features | 25 | Additional algorithms, visualizations, performance optimizations |
| Total | 500 |
Grading Levels
Excellent
Exceeds all requirements with exceptional quality
Good
Meets all requirements with good quality
Satisfactory
Meets minimum requirements
Needs Work
Missing key requirements
Ready to Submit?
Make sure you have completed all requirements and reviewed the grading rubric above.
Submit Your Project