## Introduction

Mr Ninja is a Large Context Wrapper Agent that solves GitLab Duo's ~200,000-token context limit by intelligently chunking and processing oversized merge requests.
### What does it do?
When analyzing large merge requests (500+ files, 800k+ tokens), Mr Ninja:
- Estimates the total token footprint
- Classifies files by priority (security-critical first)
- Chunks files into ~70k-token groups
- Routes each chunk to specialist agents (Security, Code Review, Dependencies)
- Maintains context across chunks with compact summaries
- Aggregates and deduplicates findings into a unified report
GitLab Duo can't natively handle MRs exceeding 200k tokens. Mr Ninja works around this by breaking large MRs into chunks, processing them sequentially, and reassembling the results, all while preserving critical context.
## Installation

### Prerequisites
- Python 3.11 or higher
- pip package manager
- (Optional) Docker for containerized deployment
### Install from Source

```bash
git clone https://gitlab.com/namdpran8/mr-ninja.git
cd mr-ninja
pip install -r requirements.txt
```
### Docker Installation

```bash
docker pull registry.gitlab.com/your-group/mr-ninja:latest
docker run -p 8000:8000 -e GITLAB_TOKEN=glpat-xxx registry.gitlab.com/your-group/mr-ninja
```
### Verify Installation

```bash
python -m pytest tests/ -v
```
## Quick Start

### Run the Demo (No GitLab Required)
The fastest way to see Mr Ninja in action:
```bash
python -m demo.simulate_large_mr --files 512
```
This generates a synthetic 512-file MR and runs the full analysis pipeline.
### Analyze a Real GitLab MR
```bash
export GITLAB_TOKEN="glpat-xxxxxxxxxxxxxxxxxxxx"

python -c "
from agents.orchestrator import Orchestrator

orchestrator = Orchestrator(
    gitlab_url='https://gitlab.com',
    gitlab_token='$GITLAB_TOKEN',
    post_comments=True,
)
report = orchestrator.analyze_mr('your-group/your-project', 42)
print(f'Risk: {report.overall_risk}, Findings: {len(report.findings)}')
"
```
### Start the API Server
```bash
uvicorn app:app --host 0.0.0.0 --port 8000

# Then analyze via HTTP
curl -X POST http://localhost:8000/analyze \
  -H "Content-Type: application/json" \
  -d '{"mr_url": "https://gitlab.com/group/project/-/merge_requests/42", "gitlab_token": "glpat-xxx"}'
```
## Configuration

### Environment Variables
| Variable | Description | Default |
|---|---|---|
| `GITLAB_TOKEN` | GitLab personal access token (required for real MR analysis) | None |
| `GITLAB_URL` | GitLab instance URL | `https://gitlab.com` |
| `MAX_CHUNK_TOKENS` | Target tokens per chunk | `70000` |
| `CHUNK_THRESHOLD` | Token count that triggers chunking | `150000` |
### Orchestrator Options

```python
orchestrator = Orchestrator(
    gitlab_url='https://gitlab.com',
    gitlab_token='glpat-xxx',
    post_comments=True,       # Post report as MR comment
    max_chunk_tokens=70000,   # Target chunk size
    chunk_threshold=150000,   # When to activate chunking
    skip_generated=True,      # Skip lock files, dist/, etc.
)
```
## Architecture

### System Overview
Mr Ninja is built as a modular pipeline with five core components:
1. **Token Estimator** -- Estimates token count using a `len(text) / 4` heuristic with content-type multipliers.
2. **Chunk Planner** -- Classifies files into priority tiers (P1-P6), sorts them, and bin-packs them into ~70k-token chunks.
3. **Chunk Processor** -- Routes chunks to specialist agents (Security, Code Review, Dependencies) based on composition.
4. **Summarizer** -- Generates compact cross-chunk context summaries with critical findings and open questions.
5. **Aggregator** -- Deduplicates findings, ranks by severity, calculates a risk score, and generates a Markdown report.
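The Token Estimator's heuristic is simple enough to sketch. The multiplier values and function names below are illustrative assumptions, not the project's actual constants:

```python
# Sketch of the len(text) / 4 token heuristic with content-type
# multipliers. The multiplier table here is an assumption for
# illustration; the real values live in the Token Estimator.
MULTIPLIERS = {
    ".py": 1.0,    # typical source code
    ".json": 1.2,  # dense punctuation tokenizes less efficiently
    ".md": 0.9,    # prose tends to tokenize slightly better
}

def estimate_tokens(text: str, extension: str = ".py") -> int:
    """Estimate the token footprint of a file's contents."""
    base = len(text) / 4  # roughly 4 characters per token on average
    return int(base * MULTIPLIERS.get(extension, 1.0))

print(estimate_tokens("x = 1\n" * 1000))  # 6000 chars -> 1500 tokens
```

The per-extension multipliers let the estimator stay cheap (no tokenizer dependency) while correcting for content types that tokenize unusually densely or sparsely.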
### Data Flow
```
MR Request
    |
    v
Orchestrator
    |
    +---> Token Estimator --> Check threshold
    |
    +---> Chunk Planner --> ChunkPlan (files grouped by priority)
    |
    +---> For each chunk:
    |        |
    |        +---> Chunk Processor --> Specialist Agents
    |        |
    |        +---> Summarizer --> ChunkSummary (context carryover)
    |
    +---> Aggregator --> Final Report
    |
    v
Post to GitLab MR
```
## Chunking Strategy

Mr Ninja uses a greedy bin-packing algorithm to create optimally sized chunks:
### Algorithm

1. Classify all files into priority tiers (P1-P6)
2. Skip P6 files (generated/lock files)
3. Sort remaining files by `(priority, path)` for deterministic ordering
4. Iterate over the files, adding each to the current chunk until the target size is exceeded
5. Give oversized files (> `target_tokens`) their own chunk
6. Assign each chunk a recommended specialist agent
### Chunk Sizing

- **Target:** 70,000 tokens per chunk
- **Range:** 40,000 -- 100,000 tokens
- **Hard cap:** never exceed 100,000 tokens
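The greedy packing step can be sketched as follows. The file representation (dicts with `path`, `priority`, `tokens` keys) and the function name are assumptions for illustration:

```python
# Minimal sketch of greedy bin-packing over prioritized files.
# Each file is assumed to be {"path": str, "priority": int, "tokens": int}.
def pack_chunks(files, target_tokens=70_000):
    """Greedily group files into chunks of roughly target_tokens each."""
    # Skip P6 (generated/lock files), then sort by (priority, path)
    files = sorted((f for f in files if f["priority"] < 6),
                   key=lambda f: (f["priority"], f["path"]))
    chunks, current, current_tokens = [], [], 0
    for f in files:
        if f["tokens"] > target_tokens:
            # Oversized file: flush the current chunk, give it its own
            if current:
                chunks.append(current)
                current, current_tokens = [], 0
            chunks.append([f])
            continue
        if current and current_tokens + f["tokens"] > target_tokens:
            chunks.append(current)          # chunk full: start a new one
            current, current_tokens = [], 0
        current.append(f)
        current_tokens += f["tokens"]
    if current:
        chunks.append(current)
    return chunks

files = [
    {"path": "auth/login.py", "priority": 1, "tokens": 50_000},
    {"path": "app.py", "priority": 2, "tokens": 30_000},
    {"path": "big_module.py", "priority": 3, "tokens": 120_000},
    {"path": "package-lock.json", "priority": 6, "tokens": 90_000},
]
print([len(c) for c in pack_chunks(files)])  # [1, 1, 1]
```

Sorting before packing is what makes the output deterministic, and the early `priority < 6` filter is where P6 files drop out.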
## Priority Tiers
Files are classified into 6 priority tiers to ensure critical files are analyzed first:
| Tier | Category | Examples | Order |
|---|---|---|---|
| P1 | Security-critical | `.env`, `Dockerfile`, `*.tf`, `auth/*`, `*.pem`, `*.key` | First |
| P2 | Entry points | `main.*`, `app.*`, `routes/*`, `api/*`, `server.*` | Second |
| P3 | Changed files | All other source files with diff hunks | Third |
| P4 | Shared modules | Files imported by multiple changed files | Fourth |
| P5 | Test files | `tests/*`, `*_test.*`, `*.spec.*`, `conftest.py` | Last |
| P6 | Generated/lock | `package-lock.json`, `*.min.js`, `dist/*`, `node_modules/*` | Skipped |
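A tier classifier over the glob patterns above can be sketched with the standard library's `fnmatch`. The pattern table and function name are illustrative, and P4 (shared modules) is omitted because it requires import-graph analysis rather than path matching:

```python
# Illustrative tier classification using the glob patterns from the
# table above. P4 detection needs import analysis and is omitted here.
from fnmatch import fnmatch

TIER_PATTERNS = [
    (1, [".env", "Dockerfile", "*.tf", "auth/*", "*.pem", "*.key"]),
    (2, ["main.*", "app.*", "routes/*", "api/*", "server.*"]),
    (5, ["tests/*", "*_test.*", "*.spec.*", "conftest.py"]),
    (6, ["package-lock.json", "*.min.js", "dist/*", "node_modules/*"]),
]

def classify(path: str) -> int:
    """Return the priority tier for a file path (P3 is the default)."""
    for tier, patterns in TIER_PATTERNS:
        if any(fnmatch(path, p) for p in patterns):
            return tier
    return 3  # any other changed source file

print(classify("auth/login.py"))      # 1 (security-critical)
print(classify("package-lock.json"))  # 6 (skipped)
print(classify("src/utils.py"))       # 3 (changed file)
```

Checking the high-priority patterns first means a file that matches several tiers (e.g. a key file inside `tests/`) lands in the most critical one.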
## Specialist Agents
Each chunk is routed to one or more specialist agents based on its file composition:
### 🔒 Security Analyst

**Triggers:** P1 security-critical files

**Detects:**
- Hardcoded secrets and credentials
- SQL injection vulnerabilities
- XSS (innerHTML, dangerouslySetInnerHTML)
- Unsafe eval()/exec() usage
- Shell injection (subprocess with shell=True)
- SSL verification disabled
- Private keys in source code
- Unsafe deserialization
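A few of these checks can be sketched as line-level regex scans. The patterns and names below are simplified assumptions; real detection is more involved (and prone to false positives a single regex can't avoid):

```python
# Simplified, regex-based sketch of a few Security Analyst checks.
# Pattern names and regexes are illustrative assumptions.
import re

SECURITY_PATTERNS = {
    "Hardcoded secret": re.compile(
        r"(?i)(password|secret|api_key)\s*=\s*['\"][^'\"]+['\"]"),
    "Unsafe eval": re.compile(r"\beval\s*\("),
    "Shell injection risk": re.compile(r"subprocess\.\w+\(.*shell\s*=\s*True"),
    "SSL verification disabled": re.compile(r"verify\s*=\s*False"),
}

def scan(source: str):
    """Return (line_number, finding_title) pairs for matched patterns."""
    findings = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for title, pattern in SECURITY_PATTERNS.items():
            if pattern.search(line):
                findings.append((lineno, title))
    return findings

code = "api_key = 'sk-12345'\nresult = eval(user_input)\n"
print(scan(code))  # [(1, 'Hardcoded secret'), (2, 'Unsafe eval')]
```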
### ⚙️ Code Reviewer

**Triggers:** P2-P4 logic files

**Detects:**
- Bare except/catch clauses
- Debug print/console.log statements
- TODO/FIXME comments
- Global/nonlocal variable usage
- Long sleep/delay calls
- Code duplication
- Complexity issues
### 📦 Dependency Analyzer

**Triggers:** Package manifests

**Detects:**
- Wildcard version specifiers (`"*"`)
- Overly broad version ranges (`>=0.`)
- Deprecated packages (lodash, moment, request)
- Known security vulnerabilities
- Missing lockfiles
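The manifest checks above can be sketched against a `package.json`-style dependency map. The function name and the exact spec heuristics are assumptions; the deprecated-package list mirrors the bullets above:

```python
# Sketch of Dependency Analyzer checks over a name -> version-spec map,
# as found in a package.json "dependencies" section. Heuristics are
# illustrative assumptions.
DEPRECATED = {"lodash", "moment", "request"}

def check_dependencies(deps: dict[str, str]) -> list[str]:
    issues = []
    for name, spec in deps.items():
        if spec.strip() == "*":
            issues.append(f"{name}: wildcard version specifier")
        elif spec.startswith(">=0."):
            issues.append(f"{name}: overly broad version range ({spec})")
        if name in DEPRECATED:
            issues.append(f"{name}: deprecated package")
    return issues

print(check_dependencies({"left-pad": "*", "moment": ">=0.1.0"}))
```

A real manifest check would also parse lockfiles and consult a vulnerability database for the "known security vulnerabilities" bullet.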
## Cross-Chunk Context
To maintain continuity across chunks, Mr Ninja generates compact summaries after each chunk:
### Context Structure

```
=== CROSS-CHUNK CONTEXT (read-only) ===
Chunks completed: 2/6
Files analyzed: 85
Critical/High findings: 3
  [CRITICAL] auth/handler.py:12 -- Hardcoded secret
  [HIGH] payments/service.py:45 -- Unsafe eval()
Open questions:
  - Verify import 'auth.validator' -- not found in this chunk
Key exports: auth.py:AuthService, utils.py:validate
=== END CONTEXT ===
```
### Carryover Rules
- CRITICAL and HIGH findings are always carried forward
- MEDIUM findings are carried if space allows
- LOW/INFO findings are dropped after 2 chunks
- Open questions persist until explicitly resolved
- Maximum context overhead: 2,000 tokens
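The carryover filter can be sketched as below, assuming findings are `(severity, chunk_index, text)` tuples; the representation and function name are illustrative, and open-question tracking is left out:

```python
# Sketch of the carryover rules: keep CRITICAL/HIGH always, MEDIUM when
# space allows, drop LOW/INFO after 2 chunks, and enforce the ~2,000-token
# context budget. Data shapes are illustrative assumptions.
def carry_forward(findings, current_chunk, budget_tokens=2_000):
    kept = []
    for severity, chunk_idx, text in findings:
        if severity in ("CRITICAL", "HIGH"):
            kept.append(text)                    # always carried forward
        elif severity == "MEDIUM":
            kept.append(text)                    # kept if the budget allows
        elif current_chunk - chunk_idx < 2:
            kept.append(text)                    # LOW/INFO age out after 2 chunks
    # Enforce the context budget (~4 chars per token heuristic)
    out, used = [], 0
    for text in kept:
        cost = max(1, len(text) // 4)
        if used + cost > budget_tokens:
            break
        out.append(text)
        used += cost
    return out

findings = [
    ("CRITICAL", 1, "auth/handler.py:12 hardcoded secret"),
    ("LOW", 1, "debug print"),       # from 3 chunks ago: dropped
    ("LOW", 3, "todo comment"),      # recent enough: kept
]
print(carry_forward(findings, current_chunk=4))
```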
## Command Line Usage

### Demo Simulation

```bash
# Basic demo (512 files)
python -m demo.simulate_large_mr

# Custom file count
python -m demo.simulate_large_mr --files 1000

# Save report
python -m demo.simulate_large_mr --output report.md
```
### Generate Test Repository

```bash
python -m demo.generate_large_repo --output-dir ./sample_repo --files 512
```
### Run Tests

```bash
# All tests
python -m pytest tests/ -v

# With coverage
python -m pytest tests/ --cov=core --cov=agents --cov-report=html
```
## Python API

### Orchestrator
```python
from agents.orchestrator import Orchestrator

orchestrator = Orchestrator(
    gitlab_url='https://gitlab.com',
    gitlab_token='glpat-xxx',
    post_comments=True,
)

# Analyze MR
report = orchestrator.analyze_mr(
    project='your-group/your-project',
    mr_iid=42,
)

# Access results
print(f'Risk: {report.overall_risk}')
print(f'Findings: {len(report.findings)}')
for finding in report.findings:
    print(f'[{finding.severity}] {finding.file}:{finding.line} -- {finding.title}')
```
### Chunk Planner

```python
from agents.chunk_planner import ChunkPlanner
from gitlab.gitlab_client import GitLabClient

client = GitLabClient(url='https://gitlab.com', token='glpat-xxx')
planner = ChunkPlanner(max_chunk_tokens=70000)

# Get MR diffs
mr_data = client.get_mr_diff('your-group/your-project', 42)

# Plan chunks
chunk_plan = planner.plan(mr_data)

print(f'Created {len(chunk_plan.chunks)} chunks')
for i, chunk in enumerate(chunk_plan.chunks):
    print(f'Chunk {i+1}: {chunk.file_count} files, ~{chunk.estimated_tokens} tokens')
```
## REST API

### Endpoints
Health check endpoint:

```json
{"status": "ok", "version": "1.0.0"}
```
Analyze a GitLab MR (`POST /analyze`):

```json
{
  "mr_url": "https://gitlab.com/group/project/-/merge_requests/42",
  "gitlab_token": "glpat-xxxxxxxxxxxxxxxxxxxx",
  "max_chunk_tokens": 70000,
  "post_comment": true
}
```
Response:

```json
{
  "status": "ok",
  "mr_id": "42",
  "chunks_processed": 6,
  "total_findings": 45,
  "critical_findings": 8,
  "overall_risk": "CRITICAL",
  "report_markdown": "# Mr Ninja Analysis Report\n...",
  "processing_time_seconds": 2.3
}
```
Run demo simulation (no GitLab token required):

```json
{"files": 512}
```
## CI/CD Integration

### GitLab CI Example
```yaml
mr-ninja:analyze:
  stage: test
  image: python:3.11-slim
  before_script:
    - pip install -r requirements.txt
  script:
    - |
      python3 -c "
      import os
      from agents.orchestrator import Orchestrator

      orchestrator = Orchestrator(
          gitlab_url=os.environ['CI_SERVER_URL'],
          gitlab_token=os.environ['GITLAB_TOKEN'],
          post_comments=True,
      )
      report = orchestrator.analyze_mr(
          os.environ['CI_PROJECT_PATH'],
          int(os.environ['CI_MERGE_REQUEST_IID']),
      )
      print(f'Analysis: {len(report.findings)} findings')
      if report.overall_risk in ['CRITICAL', 'HIGH']:
          exit(1)  # Fail pipeline
      "
  rules:
    - if: '$CI_MERGE_REQUEST_IID && $GITLAB_TOKEN'
  allow_failure: true
```
### GitHub Actions Example

```yaml
name: Mr Ninja Analysis

on: [pull_request]

jobs:
  analyze:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-python@v4
        with:
          python-version: '3.11'
      - run: pip install -r requirements.txt
      - run: python -m demo.simulate_large_mr --output report.md
      - uses: actions/upload-artifact@v3
        with:
          name: mr-ninja-report
          path: report.md
```
## API Reference

### Classes
#### Orchestrator

Main orchestrator class that coordinates the analysis pipeline.

```python
class Orchestrator:
    def __init__(
        self,
        gitlab_url: str,
        gitlab_token: str,
        post_comments: bool = False,
        max_chunk_tokens: int = 70000,
        chunk_threshold: int = 150000,
    ):
        ...

    def analyze_mr(
        self,
        project: str,
        mr_iid: int,
    ) -> AnalysisReport:
        """Analyze a GitLab merge request."""
        ...
```
#### ChunkPlanner

Plans chunk boundaries based on file priorities and token counts.

```python
class ChunkPlanner:
    def plan(self, mr_data: MRData) -> ChunkPlan:
        """Generate chunk plan for an MR."""
        ...
```
#### ChunkProcessor

Processes individual chunks through specialist agents.

```python
class ChunkProcessor:
    def process_chunk(
        self,
        chunk: Chunk,
        cross_chunk_context: str,
    ) -> ChunkSummary:
        """Process a single chunk."""
        ...
```
### Configuration Options

| Option | Type | Default | Description |
|---|---|---|---|
| `max_chunk_tokens` | int | 70000 | Target tokens per chunk |
| `chunk_threshold` | int | 150000 | Token count that triggers chunking |
| `post_comments` | bool | False | Post analysis as MR comment |
| `skip_generated` | bool | True | Skip P6 generated files |
| `verbose` | bool | False | Enable verbose logging |
## Finding Types

### Severity Levels
- CRITICAL -- Immediate security vulnerabilities (hardcoded secrets, SQL injection)
- HIGH -- Serious issues (unsafe eval, shell injection, SSL bypass)
- MEDIUM -- Code quality issues (TODO comments, bare exceptions)
- LOW -- Minor issues (debug prints, style issues)
- INFO -- Informational notices
### Risk Scoring

Overall risk is calculated as:

```
risk_score = sum([
    10 * count(CRITICAL),
     5 * count(HIGH),
     2 * count(MEDIUM),
     1 * count(LOW),
])
max_risk = 100
```
Overall risk level is determined by the worst finding's severity.
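Putting the two rules together, a worked sketch of the scoring (with the weights above, the 100-point cap, and the worst-severity level) might look like this; the function name is an assumption:

```python
# Sketch of risk scoring: weighted sum capped at max_risk = 100,
# with the overall level taken from the worst finding present.
WEIGHTS = {"CRITICAL": 10, "HIGH": 5, "MEDIUM": 2, "LOW": 1, "INFO": 0}
LEVELS = ["CRITICAL", "HIGH", "MEDIUM", "LOW", "INFO"]  # worst first

def score_and_level(severities: list[str]) -> tuple[int, str]:
    score = min(100, sum(WEIGHTS[s] for s in severities))
    level = next((lvl for lvl in LEVELS if lvl in severities), "INFO")
    return score, level

# 1 CRITICAL + 2 HIGH + 1 MEDIUM = 10 + 5 + 5 + 2 = 22, worst = CRITICAL
print(score_and_level(["CRITICAL", "HIGH", "HIGH", "MEDIUM"]))  # (22, 'CRITICAL')
```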