# E2E Display Testing with Claude Code SDK

## Overview
Oboyu now includes comprehensive end-to-end (E2E) display testing using the Claude Code SDK. This testing framework automatically verifies that CLI output formatting, progress displays, and user interfaces are working correctly from a user's perspective.
## Features
The E2E display testing framework covers:
- CLI Output Formatting: Verifies help text, version info, and command outputs
- Progress Display: Tests HierarchicalLogger and progress bar rendering
- Search Results: Validates text and JSON output formats, snippet highlighting
- Japanese Text: Ensures proper rendering of Japanese characters
- Error Messages: Checks that errors are clear and user-friendly
- Interactive Mode: Tests the interactive query interface
## Prerequisites

1. **Claude Code SDK**: Install the SDK globally:

   ```bash
   npm install -g @anthropic-ai/claude-code
   ```

2. **API Key**: Set your Anthropic API key (optional - will prompt if needed):

   ```bash
   export ANTHROPIC_API_KEY=your_api_key_here
   ```

3. **Oboyu**: Ensure Oboyu is installed and accessible:

   ```bash
   uv sync
   ```
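A small script can confirm all three prerequisites before starting a test run. The command names (`claude`, `oboyu`) and the `ANTHROPIC_API_KEY` variable come from the steps above; the helper itself is an illustrative sketch, not part of the framework:

```python
import os
import shutil


def check_prerequisites() -> list[str]:
    """Return human-readable problems; an empty list means ready to run."""
    problems = []
    # The Claude Code SDK installs a global `claude` binary via npm
    if shutil.which("claude") is None:
        problems.append("claude not found: npm install -g @anthropic-ai/claude-code")
    # Optional for interactive use, but CI runs need the variable set
    if not os.environ.get("ANTHROPIC_API_KEY"):
        problems.append("ANTHROPIC_API_KEY is not set")
    # Oboyu itself must be on PATH (installed via `uv sync`)
    if shutil.which("oboyu") is None:
        problems.append("oboyu not found: run `uv sync`")
    return problems


for problem in check_prerequisites():
    print(problem)
```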
## Running Tests

### Quick Start

Run all E2E display tests:

```bash
uv run python e2e/run_tests.py
```
### Running Specific Tests

You can run individual test categories:

```bash
# Run only basic CLI display tests
uv run python e2e/run_tests.py --test basic

# Run only search result display tests
uv run python e2e/run_tests.py --test search

# Available test options:
# - basic: Basic CLI commands (help, version, health)
# - indexing: Indexing progress display
# - search: Search result formatting
# - error: Error message display
```
### Advanced Options

```bash
# Use a custom oboyu command path
uv run python e2e/run_tests.py --oboyu-path /path/to/oboyu

# Save the report to a different location
uv run python e2e/run_tests.py --report my_report.md

# Keep test data after running (for debugging)
uv run python e2e/run_tests.py --no-cleanup
```
## Test Implementation

The E2E tests are implemented as standalone Python scripts in the `e2e/` directory. The main test runner is `run_tests.py`, which uses the `OboyuE2EDisplayTester` class from `display_tester.py`.
## Test Structure

### Test Categories

1. **Basic CLI Display** (`test_basic_cli_display`)
   - Tests `--help` output formatting
   - Verifies version display
   - Validates JSON output format

2. **Indexing Progress Display** (`test_indexing_progress_display`)
   - Creates temporary test files
   - Monitors HierarchicalLogger output
   - Checks progress bars and completion messages

3. **Search Result Display** (`test_search_result_display`)
   - Tests both text and JSON output formats
   - Verifies snippet highlighting
   - Checks Japanese text rendering

4. **Error Display** (`test_error_display`)
   - Tests various error scenarios
   - Verifies error message clarity
   - Checks formatting consistency
### Test Implementation

The main test class, `OboyuE2EDisplayTester`, uses the Claude Code SDK in headless mode:

```python
def run_claude_check(self, prompt: str) -> dict[str, Any]:
    """Execute Claude Code to check display issues."""
    # Runs claude with the -p flag for non-interactive mode
    # Uses --output-format json for structured results
```
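Fleshed out, that pattern looks roughly like the sketch below. The `-p` and `--output-format json` flags are real Claude Code CLI options; the surrounding subprocess wiring is illustrative and simplified from the actual class:

```python
import json
import subprocess
from typing import Any


def build_claude_command(prompt: str) -> list[str]:
    """Build a headless Claude Code invocation for one display check."""
    # -p (print) runs non-interactively; JSON output is machine-parseable
    return ["claude", "-p", prompt, "--output-format", "json"]


def run_claude_check(prompt: str) -> dict[str, Any]:
    """Run the check and return Claude Code's structured JSON result."""
    proc = subprocess.run(
        build_claude_command(prompt),
        capture_output=True, text=True, check=True,
    )
    return json.loads(proc.stdout)
```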
## Test Reports
After running tests, a comprehensive Markdown report is generated containing:
- Test results for each category
- Specific issues identified
- UI/UX improvement suggestions
- Metadata (API costs, execution time)
Example report structure:
```markdown
# Oboyu E2E Display Test Report

## Summary
- Total tests run: 6
- Test environment: /path/to/oboyu
- Oboyu command: oboyu

## Test Results

### Basic CLI Display
[Claude Code's analysis of CLI display quality]

### Indexing Progress Display
[Analysis of progress display functionality]

...

## Metadata
- Total cost: $0.0234
- Total duration: 15234ms
- Total turns: 12
```
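A helper that assembles results into this shape might look like the following; the function name and its inputs are hypothetical, and only the report layout comes from the example above:

```python
def render_report(results: dict[str, str], metadata: dict[str, object]) -> str:
    """Render per-category analyses and run metadata as a Markdown report."""
    lines = [
        "# Oboyu E2E Display Test Report",
        "",
        "## Summary",
        f"- Total tests run: {len(results)}",
        "",
        "## Test Results",
    ]
    # One subsection per test category, holding Claude Code's analysis text
    for category, analysis in results.items():
        lines += [f"### {category}", analysis, ""]
    lines.append("## Metadata")
    lines += [f"- {key}: {value}" for key, value in metadata.items()]
    return "\n".join(lines)
```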
## Integration with CI/CD

You can integrate E2E display testing into your CI/CD pipeline:

```yaml
# Example GitHub Actions workflow
- name: Run E2E Display Tests
  env:
    ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
  run: |
    npm install -g @anthropic-ai/claude-code
    uv run python e2e/run_tests.py --test basic
```
## Cost Considerations
- Each test run consumes API tokens based on the complexity of the checks
- The framework reports total API costs in the test report
- Consider running a subset of tests during development
- Use the full test suite for release validation
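Because each headless Claude Code run returns cost and timing metadata in its JSON result, per-run figures can be aggregated for the report. The field names below (`total_cost_usd`, `duration_ms`) match Claude Code's JSON output format, though reading them defensively is wise in case the format changes:

```python
def summarize_costs(check_results: list[dict[str, object]]) -> dict[str, float]:
    """Aggregate cost and duration across individual Claude Code checks."""
    # Missing fields default to zero rather than raising
    total_cost = sum(float(r.get("total_cost_usd", 0.0)) for r in check_results)
    total_ms = sum(int(r.get("duration_ms", 0)) for r in check_results)
    return {"total_cost_usd": round(total_cost, 4), "total_duration_ms": total_ms}
```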
## Troubleshooting

### Common Issues

1. **Claude Code not found**
   - Ensure the Claude Code SDK is installed globally
   - Check that your PATH includes the npm global bin directory

2. **API Key errors**
   - Verify that ANTHROPIC_API_KEY is set correctly
   - Check that the API key has sufficient credits

3. **Test failures**
   - Review the generated report for specific issues
   - Use `--no-cleanup` to inspect test data
   - Check that the oboyu command path is correct
### Debug Mode

For debugging, you can run the test module directly:

```bash
# Run with Python directly for debugging
cd e2e
uv run python run_tests.py --test basic --no-cleanup
```
## Best Practices
- Regular Testing: Run E2E display tests before releases
- Selective Testing: Use specific test categories during development
- Report Review: Always review generated reports for UI/UX insights
- Cost Management: Monitor API costs and adjust test frequency accordingly
- Continuous Improvement: Update tests when adding new display features