AdBlock Compiler Documentation

Welcome to the AdBlock Compiler documentation. This directory contains all the detailed documentation for the project.

Documentation Structure

docs/
├── api/             # REST API reference, OpenAPI spec, streaming, and validation
├── cloudflare/      # Cloudflare-specific features (Queues, D1, Workflows, Analytics)
├── database-setup/  # Database architecture, PostgreSQL, Prisma, and local dev setup
├── deployment/      # Docker, Cloudflare Pages/Containers, and production readiness
├── development/     # Architecture, extensibility, diagnostics, and code quality
├── frontend/        # Angular SPA, Vite, Tailwind CSS, and UI components
├── guides/          # Getting started, migration, client libraries, and troubleshooting
├── postman/         # Postman collection and environment files
├── reference/       # Version management, environment config, and project reference
├── releases/        # Release notes and announcements
├── testing/         # Testing guides, E2E, and Postman API testing
└── workflows/       # GitHub Actions CI/CD workflows and automation

Getting Started

Usage

API Reference

Cloudflare Worker

Deployment

Storage & Database

Frontend Development

Development

Testing

CI/CD & Workflows

Reference

Releases


Contributing

See the main README and CONTRIBUTING for information on how to contribute to this project.

API Reference

The full TypeScript API reference is automatically generated from the JSDoc annotations embedded in the src/ source files using deno doc --html.

Browsing the reference

Tip: The API reference is a separate static site generated alongside this book. Click the button below (or the sidebar link) to open it.

Note: The api-reference/index.html link above is only available after running deno task docs:api (to generate just the API reference) or deno task docs:build (to build the full site) locally or in a deployed mdBook site. It is not present in the repository source tree.

What is documented

Every symbol exported from the library's main entry point (src/index.ts) is covered, including:

CategoryKey exports
CompilerFilterCompiler, SourceCompiler, IncrementalCompiler, compile()
TransformationsRemoveCommentsTransformation, DeduplicateTransformation, CompressTransformation, ValidateTransformation, …
PlatformWorkerCompiler, HttpFetcher, CompositeFetcher, PlatformDownloader
FormattersAdblockFormatter, HostsFormatter, DnsmasqFormatter, JsonFormatter, …
ServicesFilterService, ASTViewerService, AnalyticsService
DiagnosticsDiagnosticsCollector, createTracingContext, traceAsync, traceSync
UtilsRuleUtils, Logger, CircuitBreaker, CompilerEventEmitter, …
ConfigurationConfigurationSchema, ConfigurationValidator, all Zod schemas
TypesAll public interfaces (IConfiguration, ILogger, ICompilerEvents, …)
DiffDiffGenerator, generateDiff
PluginsPluginRegistry, PluginTransformationWrapper

Regenerating locally

# Generate the HTML API reference into book/api-reference/
deno task docs:api

# Build the full mdBook site + API reference in one step
deno task docs:build

# Live-preview the mdBook (does not include API reference)
deno task docs:serve

JSDoc conventions

All public classes, interfaces, methods, and enum values are documented with JSDoc comments following the project's conventions:

/**
 * Brief one-line description.
 *
 * Longer explanation of behaviour, constraints, or design decisions.
 *
 * @param inputRules - The raw rule strings to process.
 * @returns The transformed rule strings.
 * @example
 * ```ts
 * const result = new DeduplicateTransformation().executeSync(rules);
 * ```
 */

See docs/development/CODE_REVIEW.md for the full documentation style guide.

Adblock Compiler API

Version: 2.0.0

Description

Compiler-as-a-Service for adblock filter lists. Transform, optimize, and combine filter lists from multiple sources with real-time progress tracking.

Features

  • 🎯 Multi-Source Compilation
  • ⚡ Performance (Gzip compression, caching, request deduplication)
  • 🔄 Circuit Breaker with retry logic
  • 📊 Visual Diff between compilations
  • 📡 Real-time progress via SSE and WebSocket
  • 🎪 Batch Processing
  • 🌍 Universal (Deno, Node.js, Cloudflare Workers, browsers)

Servers

  • Production server: https://adblock-compiler.jayson-knight.workers.dev
  • Local development server: http://localhost:8787

Endpoints

Metrics

GET /api

Summary: Get API information

Returns API version, available endpoints, and usage examples

Operation ID: getApiInfo

Responses:

  • 200: API information

GET /metrics

Summary: Get performance metrics

Returns aggregated metrics for the last 30 minutes

Operation ID: getMetrics

Responses:

  • 200: Performance metrics

Compilation

POST /compile

Summary: Compile filter list (JSON)

Compile filter lists and return results as JSON. Results are cached for 1 hour. Supports request deduplication for concurrent identical requests.

Operation ID: compileJson

Request Body:

Responses:

  • 200: Compilation successful
  • 429: No description
  • 500: No description

POST /compile/batch

Summary: Batch compile multiple lists

Compile multiple filter lists in parallel (max 10 per batch)

Operation ID: compileBatch

Request Body:

Responses:

  • 200: Batch compilation results
  • 400: Invalid batch request
  • 429: No description

Streaming

POST /compile/stream

Summary: Compile with real-time progress (SSE)

Compile filter lists with real-time progress updates via Server-Sent Events. Streams events including source downloads, transformations, diagnostics, cache operations, network events, and metrics.

Operation ID: compileStream

Request Body:

Responses:

  • 200: Event stream
  • 429: No description

Queue

POST /compile/async

Summary: Queue async compilation job

Queue a compilation job for asynchronous processing. Returns immediately with a request ID. Use GET /queue/results/{requestId} to retrieve results when complete.

Operation ID: compileAsync

Request Body:

Responses:

  • 202: Job queued successfully
  • 500: Queue not available

POST /compile/batch/async

Summary: Queue batch async compilation

Queue multiple compilations for async processing

Operation ID: compileBatchAsync

Request Body:

Responses:

  • 202: Batch queued successfully

GET /queue/stats

Summary: Get queue statistics

Returns queue health metrics and job statistics

Operation ID: getQueueStats

Responses:

  • 200: Queue statistics

GET /queue/results/{requestId}

Summary: Get async job results

Retrieve results for a completed async compilation job

Operation ID: getQueueResults

Parameters:

  • requestId (path) (required): Request ID returned from async endpoints

Responses:

  • 200: Job results
  • 404: Job not found

WebSocket

GET /ws/compile

Summary: WebSocket endpoint for real-time compilation

Bidirectional WebSocket connection for real-time compilation with event streaming.

Client → Server Messages:

  • compile - Start compilation
  • cancel - Cancel running compilation
  • ping - Heartbeat ping

Server → Client Messages:

  • welcome - Connection established
  • pong - Heartbeat response
  • compile:started - Compilation started
  • event - Compilation event (source, transformation, progress, diagnostic, cache, network, metric)
  • compile:complete - Compilation finished successfully
  • compile:error - Compilation failed
  • compile:cancelled - Compilation cancelled
  • error - Error message

Features:

  • Up to 3 concurrent compilations per connection
  • Automatic heartbeat (30s interval)
  • Connection timeout (5 minutes idle)
  • Session-based compilation tracking
  • Cancellation support

Operation ID: websocketCompile

Responses:

  • 101: WebSocket connection established
  • 426: Upgrade required (not a WebSocket request)

Schemas

CompileRequest

Properties:

  • configuration (required): Configuration -
  • preFetchedContent: object - Map of source keys to pre-fetched content
  • benchmark: boolean - Include detailed performance metrics
  • turnstileToken: string - Cloudflare Turnstile token (if enabled)

Configuration

Properties:

  • name (required): string - Name of the compiled list
  • description: string - Description of the list
  • homepage: string - Homepage URL
  • license: string - License identifier
  • version: string - Version string
  • sources (required): array -
  • transformations: array - Global transformations to apply
  • exclusions: array - Rules to exclude (supports wildcards and regex)
  • exclusions_sources: array - Files containing exclusion rules
  • inclusions: array - Rules to include (supports wildcards and regex)
  • inclusions_sources: array - Files containing inclusion rules

Source

Properties:

  • source (required): string - URL or key for pre-fetched content
  • name: string - Name of the source
  • type: string - Source type
  • transformations: array -
  • exclusions: array -
  • inclusions: array -

Transformation

Available transformations (applied in this order):

  • ConvertToAscii: Convert internationalized domains to ASCII
  • RemoveComments: Remove comment lines
  • Compress: Convert hosts format to adblock syntax
  • RemoveModifiers: Strip unsupported modifiers
  • Validate: Remove invalid/dangerous rules
  • ValidateAllowIp: Like Validate but keeps IP addresses
  • Deduplicate: Remove duplicate rules
  • InvertAllow: Convert blocking rules to allowlist
  • RemoveEmptyLines: Remove blank lines
  • TrimLines: Remove leading/trailing whitespace
  • InsertFinalNewLine: Add final newline

Enum values:

  • ConvertToAscii
  • RemoveComments
  • Compress
  • RemoveModifiers
  • Validate
  • ValidateAllowIp
  • Deduplicate
  • InvertAllow
  • RemoveEmptyLines
  • TrimLines
  • InsertFinalNewLine

BatchCompileRequest

Properties:

  • requests (required): array -

BatchRequestItem

Properties:

  • id (required): string - Unique request identifier
  • configuration (required): Configuration -
  • preFetchedContent: object -
  • benchmark: boolean -

CompileResponse

Properties:

  • success (required): boolean -
  • rules: array - Compiled filter rules
  • ruleCount: integer - Number of rules
  • metrics: CompilationMetrics -
  • compiledAt: string -
  • previousVersion: PreviousVersion -
  • cached: boolean - Whether result was served from cache
  • deduplicated: boolean - Whether request was deduplicated
  • error: string - Error message if success=false

CompilationMetrics

Properties:

  • totalDurationMs: integer -
  • sourceCount: integer -
  • ruleCount: integer -
  • transformationMetrics: array -

PreviousVersion

Properties:

  • rules: array -
  • ruleCount: integer -
  • compiledAt: string -

BatchCompileResponse

Properties:

  • success: boolean -
  • results: array -

QueueResponse

Properties:

  • success: boolean -
  • message: string -
  • requestId: string -
  • priority: string -

QueueJobStatus

Properties:

  • success: boolean -
  • status: string -
  • jobInfo: object -

QueueStats

Properties:

  • pending: integer -
  • completed: integer -
  • failed: integer -
  • cancelled: integer -
  • totalProcessingTime: integer -
  • averageProcessingTime: integer -
  • processingRate: number - Jobs per minute
  • queueLag: integer - Average time in queue (ms)
  • lastUpdate: string -
  • history: array -
  • depthHistory: array -

JobHistoryEntry

Properties:

  • requestId: string -
  • configName: string -
  • status: string -
  • duration: integer -
  • timestamp: string -
  • error: string -
  • ruleCount: integer -

MetricsResponse

Properties:

  • window: string -
  • timestamp: string -
  • endpoints: object -

ApiInfo

Properties:

  • name: string -
  • version: string -
  • endpoints: object -
  • example: object -

WsCompileRequest

Properties:

  • type (required): string -
  • sessionId (required): string -
  • configuration (required): Configuration -
  • preFetchedContent: object -
  • benchmark: boolean -

WsCancelRequest

Properties:

  • type (required): string -
  • sessionId (required): string -

WsPingMessage

Properties:

  • type (required): string -

WsWelcomeMessage

Properties:

  • type (required): string -
  • version (required): string -
  • connectionId (required): string -
  • capabilities (required): object -

WsPongMessage

Properties:

  • type (required): string -
  • timestamp: string -

WsCompileStartedMessage

Properties:

  • type (required): string -
  • sessionId (required): string -
  • configurationName (required): string -

WsEventMessage

Properties:

  • type (required): string -
  • sessionId (required): string -
  • eventType (required): string -
  • data (required): object -

WsCompileCompleteMessage

Properties:

  • type (required): string -
  • sessionId (required): string -
  • rules (required): array -
  • ruleCount (required): integer -
  • metrics: object -
  • compiledAt: string -

WsCompileErrorMessage

Properties:

  • type (required): string -
  • sessionId (required): string -
  • error (required): string -
  • details: object -


Additional API Documentation

AGTree Integration

This document describes the integration of @adguard/agtree into the adblock-compiler project.

Overview

AGTree is AdGuard's official tool set for working with adblock filter lists. It provides:

  • Adblock rule parser - Parses rules into Abstract Syntax Trees (AST)
  • Rule converter - Converts rules between different adblock syntaxes
  • Rule validator - Validates rules against known modifier definitions
  • Compatibility tables - Maps modifiers/features across different ad blockers

Why AGTree?

Before AGTree

The compiler used custom regex-based parsing in RuleUtils.ts:

  • Limited to basic pattern matching
  • No formal grammar or AST representation
  • Manual modifier validation
  • No syntax detection for different ad blockers
  • Prone to edge-case parsing errors

After AGTree

FeatureBeforeAfter
Rule ParsingCustom regexFull AST with location info
Syntax SupportBasic adblockAdGuard, uBlock Origin, Adblock Plus
Modifier ValidationHardcoded listCompatibility tables
Error HandlingString matchingStructured errors with positions
Rule TypesNetwork + hostsAll cosmetic, network, comments
MaintainabilityManual updatesUpstream library updates

Architecture

Module Structure

src/utils/
├── AGTreeParser.ts    # Wrapper module for AGTree
├── RuleUtils.ts       # Refactored to use AGTreeParser
└── index.ts           # Exports AGTreeParser types

AGTreeParser Wrapper

The AGTreeParser class provides a simplified interface to AGTree:

import { AGTreeParser } from '@/utils/AGTreeParser.ts';

// Parse a single rule
const result = AGTreeParser.parse('||example.com^$third-party');
if (result.success && AGTreeParser.isNetworkRule(result.ast!)) {
    const props = AGTreeParser.extractNetworkRuleProperties(result.ast);
    console.log(props.pattern);    // '||example.com^'
    console.log(props.modifiers);  // [{ name: 'third-party', value: null, exception: false }]
}

// Parse an entire filter list
const filterList = AGTreeParser.parseFilterList(rawFilterListText);
for (const rule of filterList.children) {
    if (AGTreeParser.isNetworkRule(rule)) {
        // Process network rule
    }
}

// Detect syntax
const syntax = AGTreeParser.detectSyntax('example.com##+js(aopr, ads)');
// Returns: AdblockSyntax.Ubo

Key Features

1. Type Guards

AGTreeParser provides comprehensive type guards for all rule types:

AGTreeParser.isEmpty(rule)           // Empty lines
AGTreeParser.isComment(rule)         // All comment types
AGTreeParser.isSimpleComment(rule)   // ! or # comments
AGTreeParser.isMetadataComment(rule) // ! Title: ...
AGTreeParser.isHintComment(rule)     // !+ NOT_OPTIMIZED
AGTreeParser.isPreProcessorComment(rule) // !#if, !#include
AGTreeParser.isNetworkRule(rule)     // ||domain^ style
AGTreeParser.isHostRule(rule)        // /etc/hosts style
AGTreeParser.isCosmeticRule(rule)    // ##, #@#, etc.
AGTreeParser.isElementHidingRule(rule)
AGTreeParser.isCssInjectionRule(rule)
AGTreeParser.isScriptletRule(rule)
AGTreeParser.isExceptionRule(rule)   // @@ or #@# rules

2. Property Extraction

Extract structured data from parsed rules:

// Network rules
const props = AGTreeParser.extractNetworkRuleProperties(networkRule);
// Returns: { pattern, isException, modifiers, syntax, ruleText }

// Host rules
const hostProps = AGTreeParser.extractHostRuleProperties(hostRule);
// Returns: { ip, hostnames, comment, ruleText }

// Cosmetic rules
const cosmeticProps = AGTreeParser.extractCosmeticRuleProperties(cosmeticRule);
// Returns: { domains, separator, isException, body, type, syntax, ruleText }

3. Modifier Utilities

Work with network rule modifiers:

// Find a specific modifier
const mod = AGTreeParser.findModifier(rule, 'domain');

// Check if modifier exists
const hasThirdParty = AGTreeParser.hasModifier(rule, 'third-party');

// Get modifier value
const domainValue = AGTreeParser.getModifierValue(rule, 'domain');
// Returns: 'example.com|~example.org' or null

4. Validation

Validate rules and modifiers:

// Validate a single modifier
const result = AGTreeParser.validateModifier('important', undefined, AdblockSyntax.Adg);
// Returns: { valid: boolean, errors: string[] }

// Validate all modifiers in a network rule
const validation = AGTreeParser.validateNetworkRuleModifiers(rule);
if (!validation.valid) {
    console.log(validation.errors);
}

5. Syntax Detection

Automatically detect which ad blocker syntax a rule uses:

const syntax = AGTreeParser.detectSyntax(ruleText);
// Returns: AdblockSyntax.Adg | Ubo | Abp | Common

// Check specific syntax
AGTreeParser.isAdGuardSyntax(rule)   // AdGuard-specific
AGTreeParser.isUBlockSyntax(rule)    // uBlock Origin-specific
AGTreeParser.isAbpSyntax(rule)       // Adblock Plus-specific

Integration Points

RuleUtils

RuleUtils now uses AGTree internally while maintaining the same public API:

// These methods now use AGTree parsing internally:
RuleUtils.isComment(ruleText)
RuleUtils.isAllowRule(ruleText)
RuleUtils.isEtcHostsRule(ruleText)
RuleUtils.loadAdblockRuleProperties(ruleText)
RuleUtils.loadEtcHostsRuleProperties(ruleText)

// New AGTree-powered methods:
RuleUtils.parseToAST(ruleText)       // Get raw AST
RuleUtils.isValidRule(ruleText)      // Check parseability
RuleUtils.isNetworkRule(ruleText)    // Network rule check
RuleUtils.isCosmeticRule(ruleText)   // Cosmetic rule check
RuleUtils.detectSyntax(ruleText)     // Syntax detection

ValidateTransformation

The validation transformation uses AGTree for robust rule validation:

  • Parses rules once and reuses the AST
  • Uses structured type checking instead of regex
  • Validates modifiers against AGTree's compatibility tables
  • Properly handles all rule categories (network, host, cosmetic, comment)
  • Provides better error messages with context
// Before: String-based validation
if (RuleUtils.isEtcHostsRule(ruleText)) {
    return this.validateEtcHostsRule(ruleText);
}

// After: AST-based validation  
if (AGTreeParser.isHostRule(ast)) {
    return this.validateHostRule(ast as HostRule, ruleText);
}

Configuration

AGTree is configured in deno.json:

{
    "imports": {
        "@adguard/agtree": "npm:@adguard/agtree@^3.4.3"
    }
}

Performance Considerations

  1. Parsing Once: Parse each rule once and pass the AST to multiple validation functions
  2. Tolerant Mode: Use tolerant: true to get InvalidRule nodes instead of exceptions
  3. Include Raws: Use includeRaws: true to preserve original rule text in AST
const DEFAULT_PARSER_OPTIONS: ParserOptions = {
    parseHostRules: true,
    includeRaws: true,
    tolerant: true,
};

Error Handling

AGTree provides structured error information:

const result = AGTreeParser.parse(ruleText);

if (!result.success) {
    console.log(result.error);    // Error message
    console.log(result.ruleText); // Original rule
    
    // In tolerant mode, ast may be an InvalidRule
    if (result.ast?.category === RuleCategory.Invalid) {
        // Access error details from the InvalidRule node
    }
}

Supported Rule Types

AGTree supports parsing all major adblock rule types:

Network Rules

  • Basic blocking: ||example.com^
  • Exception: @@||example.com^
  • With modifiers: ||example.com^$third-party,script

Host Rules

  • Standard: 127.0.0.1 example.com
  • Multiple hosts: 0.0.0.0 ad1.com ad2.com
  • With comments: 127.0.0.1 example.com # block ads

Cosmetic Rules

  • Element hiding: example.com##.ad-banner
  • Extended CSS: example.com#?#.ad:has(> .text)
  • CSS injection: example.com#$#.ad { display: none !important; }
  • Scriptlet injection: example.com#%#//scriptlet('abort-on-property-read', 'ads')

Comment Rules

  • Simple: ! This is a comment
  • Metadata: ! Title: My Filter List
  • Hints: !+ NOT_OPTIMIZED PLATFORM(windows)
  • Preprocessor: !#if (adguard)

Future Improvements

  1. Rule Conversion: Use AGTree's converter to transform rules between syntaxes
  2. Batch Parsing: Use FilterListParser for bulk operations
  3. Streaming: Process large filter lists without loading all into memory
  4. Diagnostics: Leverage AGTree's location info for better error reporting

References

Batch API Guide - Visual Learning Edition

📚 A comprehensive visual guide to using the Batch Compilation API

This guide provides detailed explanations and diagrams for working with batch compilations in the adblock-compiler API. Perfect for visual learners!

Table of Contents


Overview

The Batch API allows you to compile multiple filter lists in a single request. Behind the scenes, it uses Cloudflare Queues for reliable, scalable processing.

Key Benefits

graph TB
    subgraph "Why Use Batch API?"
        A[Batch API] --> B[🚀 Parallel Processing]
        A --> C[⚡ Efficient Resource Use]
        A --> D[🔄 Automatic Retries]
        A --> E[📊 Progress Tracking]
        A --> F[💰 Cost Effective]
    end
    
    style A fill:#667eea,stroke:#333,stroke-width:3px,color:#fff
    style B fill:#10b981,stroke:#333,stroke-width:2px,color:#fff
    style C fill:#10b981,stroke:#333,stroke-width:2px,color:#fff
    style D fill:#10b981,stroke:#333,stroke-width:2px,color:#fff
    style E fill:#10b981,stroke:#333,stroke-width:2px,color:#fff
    style F fill:#10b981,stroke:#333,stroke-width:2px,color:#fff

Architecture Diagrams

High-Level System Architecture

graph LR
    subgraph "Client Layer"
        Client[👤 Your Application]
    end
    
    subgraph "API Layer"
        API[🌐 Worker API<br/>POST /compile/batch]
        AAPI[🌐 Async API<br/>POST /compile/batch/async]
    end
    
    subgraph "Processing Layer"
        Compiler[⚙️ Batch Compiler<br/>Parallel Processing]
        Queue[📬 Cloudflare Queue<br/>Message Broker]
        Consumer[🔄 Queue Consumer<br/>Background Worker]
    end
    
    subgraph "Storage Layer"
        Cache[💾 KV Cache<br/>Results Storage]
        R2[📦 R2 Storage<br/>Large Results]
    end
    
    Client -->|Sync Request| API
    Client -->|Async Request| AAPI
    
    API --> Compiler
    AAPI --> Queue
    Queue --> Consumer
    Consumer --> Compiler
    
    Compiler --> Cache
    Compiler --> R2
    Cache -.->|Cached Result| Client
    R2 -.->|Large Result| Client
    
    style Client fill:#667eea,stroke:#333,stroke-width:2px,color:#fff
    style API fill:#f59e0b,stroke:#333,stroke-width:2px,color:#fff
    style AAPI fill:#f59e0b,stroke:#333,stroke-width:2px,color:#fff
    style Compiler fill:#10b981,stroke:#333,stroke-width:2px,color:#fff
    style Queue fill:#8b5cf6,stroke:#333,stroke-width:2px,color:#fff
    style Consumer fill:#8b5cf6,stroke:#333,stroke-width:2px,color:#fff
    style Cache fill:#3b82f6,stroke:#333,stroke-width:2px,color:#fff
    style R2 fill:#3b82f6,stroke:#333,stroke-width:2px,color:#fff

Queue Processing Pipeline

graph TB
    subgraph "Input"
        REQ[📝 Batch Request<br/>Max 10 items]
    end
    
    subgraph "Validation"
        VAL{✅ Validate<br/>Request}
        ERR1[❌ Error:<br/>Too many items]
        ERR2[❌ Error:<br/>Invalid config]
    end
    
    subgraph "Queue Selection"
        PRIORITY{🎯 Priority?}
        HPQ[⚡ High Priority Queue<br/>Faster processing]
        SPQ[📋 Standard Queue<br/>Normal processing]
    end
    
    subgraph "Processing"
        BATCH[📦 Batch Messages<br/>Group by priority]
        PROCESS[⚙️ Compile Each Item<br/>Parallel execution]
    end
    
    subgraph "Storage"
        CACHE[💾 Cache Results<br/>1 hour TTL]
        METRICS[📊 Update Metrics<br/>Track performance]
    end
    
    subgraph "Output"
        RESPONSE[✅ Success Response<br/>With request ID]
        NOTIFY[🔔 Optional Webhook<br/>Completion notification]
    end
    
    REQ --> VAL
    VAL -->|Valid| PRIORITY
    VAL -->|Invalid| ERR1
    VAL -->|Bad Config| ERR2
    
    PRIORITY -->|High| HPQ
    PRIORITY -->|Standard| SPQ
    
    HPQ --> BATCH
    SPQ --> BATCH
    BATCH --> PROCESS
    PROCESS --> CACHE
    PROCESS --> METRICS
    CACHE --> RESPONSE
    METRICS --> NOTIFY
    
    style REQ fill:#667eea,stroke:#333,stroke-width:2px,color:#fff
    style VAL fill:#f59e0b,stroke:#333,stroke-width:2px,color:#000
    style PRIORITY fill:#f59e0b,stroke:#333,stroke-width:2px,color:#000
    style HPQ fill:#ef4444,stroke:#333,stroke-width:2px,color:#fff
    style SPQ fill:#10b981,stroke:#333,stroke-width:2px,color:#fff
    style BATCH fill:#8b5cf6,stroke:#333,stroke-width:2px,color:#fff
    style PROCESS fill:#10b981,stroke:#333,stroke-width:2px,color:#fff
    style CACHE fill:#3b82f6,stroke:#333,stroke-width:2px,color:#fff
    style RESPONSE fill:#10b981,stroke:#333,stroke-width:2px,color:#fff
    style ERR1 fill:#ef4444,stroke:#333,stroke-width:2px,color:#fff
    style ERR2 fill:#ef4444,stroke:#333,stroke-width:2px,color:#fff

Batch Types

Synchronous vs Asynchronous Comparison

graph TB
    subgraph "Synchronous Batch"
        SYNC_REQ[📤 POST /compile/batch]
        SYNC_WAIT[⏳ Wait for completion<br/>Max 30 seconds]
        SYNC_RESP[📥 Immediate response<br/>With all results]
        
        SYNC_REQ --> SYNC_WAIT --> SYNC_RESP
    end
    
    subgraph "Asynchronous Batch"
        ASYNC_REQ[📤 POST /compile/batch/async]
        ASYNC_ACK[⚡ Immediate acknowledgment<br/>202 Accepted]
        ASYNC_QUEUE[📬 Background processing<br/>No time limit]
        ASYNC_CHECK[🔍 GET /queue/results/:id<br/>Check status]
        ASYNC_RESP[📥 Get results when ready]
        
        ASYNC_REQ --> ASYNC_ACK
        ASYNC_ACK --> ASYNC_QUEUE
        ASYNC_QUEUE --> ASYNC_CHECK
        ASYNC_CHECK --> ASYNC_RESP
    end
    
    style SYNC_REQ fill:#f59e0b,stroke:#333,stroke-width:2px,color:#fff
    style SYNC_WAIT fill:#ef4444,stroke:#333,stroke-width:2px,color:#fff
    style SYNC_RESP fill:#10b981,stroke:#333,stroke-width:2px,color:#fff
    style ASYNC_REQ fill:#f59e0b,stroke:#333,stroke-width:2px,color:#fff
    style ASYNC_ACK fill:#10b981,stroke:#333,stroke-width:2px,color:#fff
    style ASYNC_QUEUE fill:#8b5cf6,stroke:#333,stroke-width:2px,color:#fff
    style ASYNC_CHECK fill:#3b82f6,stroke:#333,stroke-width:2px,color:#fff
    style ASYNC_RESP fill:#10b981,stroke:#333,stroke-width:2px,color:#fff

When to Use Each Type

mindmap
    root((Batch API<br/>Decision))
        Synchronous
            Small batches ≤ 3 items
            Fast filter lists
            Need immediate results
            Low complexity transformations
            User waiting for response
        Asynchronous
            Large batches 4-10 items
            Slow/large filter lists
            Can poll for results
            Complex transformations
            Background processing
            Webhook notifications

API Endpoints

Endpoint Overview

graph LR
    subgraph "Batch Endpoints"
        direction TB
        E1[📍 POST /compile/batch<br/>Synchronous]
        E2[📍 POST /compile/batch/async<br/>Asynchronous]
        E3[📍 GET /queue/results/:id<br/>Get async results]
        E4[📍 GET /queue/stats<br/>Queue statistics]
    end
    
    subgraph "Use Cases"
        direction TB
        U1[🎯 Quick batch compilation]
        U2[⏱️ Long-running compilations]
        U3[📊 Check completion status]
        U4[📈 Monitor queue health]
    end
    
    E1 -.-> U1
    E2 -.-> U2
    E3 -.-> U3
    E4 -.-> U4
    
    style E1 fill:#10b981,stroke:#333,stroke-width:2px,color:#fff
    style E2 fill:#8b5cf6,stroke:#333,stroke-width:2px,color:#fff
    style E3 fill:#3b82f6,stroke:#333,stroke-width:2px,color:#fff
    style E4 fill:#f59e0b,stroke:#333,stroke-width:2px,color:#fff
    style U1 fill:#dbeafe,stroke:#333,stroke-width:1px
    style U2 fill:#ede9fe,stroke:#333,stroke-width:1px
    style U3 fill:#dbeafe,stroke:#333,stroke-width:1px
    style U4 fill:#fef3c7,stroke:#333,stroke-width:1px

Request Structure Diagram

graph TB
    subgraph "Batch Request Structure"
        ROOT[🔷 Root Object]
        REQUESTS[📋 requests array<br/>Min: 1, Max: 10]
        
        ROOT --> REQUESTS
        
        REQUESTS --> ITEM1[Item 1]
        REQUESTS --> ITEM2[Item 2]
        REQUESTS --> ITEMN[Item N...]
        
        ITEM1 --> ID1[id: string<br/>unique identifier]
        ITEM1 --> CFG1[configuration: object<br/>compilation config]
        ITEM1 --> PRE1[preFetchedContent?: object<br/>optional pre-fetched data]
        ITEM1 --> BMK1[benchmark?: boolean<br/>enable metrics]
        
        CFG1 --> NAME[name: string<br/>list name]
        CFG1 --> SOURCES[sources: array<br/>filter list sources]
        CFG1 --> TRANS[transformations?: array<br/>processing steps]
        
        SOURCES --> SRC1[Source 1<br/>URL or key]
        SOURCES --> SRC2[Source 2<br/>URL or key]
    end
    
    style ROOT fill:#667eea,stroke:#333,stroke-width:3px,color:#fff
    style REQUESTS fill:#8b5cf6,stroke:#333,stroke-width:2px,color:#fff
    style ITEM1 fill:#10b981,stroke:#333,stroke-width:2px,color:#fff
    style ITEM2 fill:#10b981,stroke:#333,stroke-width:2px,color:#fff
    style ITEMN fill:#10b981,stroke:#333,stroke-width:2px,color:#fff
    style CFG1 fill:#3b82f6,stroke:#333,stroke-width:2px,color:#fff

Request/Response Flow

Synchronous Batch Flow (Detailed)

sequenceDiagram
    participant Client as 👤 Client
    participant API as 🌐 API Gateway
    participant Validator as ✅ Validator
    participant Compiler as ⚙️ Batch Compiler
    participant Cache as 💾 KV Cache
    participant Sources as 🌍 External Sources
    
    Note over Client,Sources: Synchronous Batch Compilation Flow
    
    Client->>API: POST /compile/batch
    Note right of Client: Request with 1-10 items
    
    API->>Validator: Validate request
    
    alt Invalid request
        Validator-->>API: ❌ Validation errors
        API-->>Client: 400 Bad Request
    else Valid request
        Validator-->>API: ✅ Valid
        
        API->>Compiler: Start batch compilation
        
        Note over Compiler: Process items in parallel
        
        loop For each item
            Compiler->>Cache: Check cache
            
            alt Cache hit
                Cache-->>Compiler: ⚡ Cached result
            else Cache miss
                Cache-->>Compiler: 🚫 Not cached
                
                Compiler->>Sources: Fetch filter lists
                Sources-->>Compiler: 📥 Raw content
                
                Compiler->>Compiler: Apply transformations
                Compiler->>Cache: 💾 Store result
            end
        end
        
        Compiler-->>API: ✅ All results
        API-->>Client: 200 OK with results array
    end
    
    Note over Client,Sources: Total time: typically 2-30 seconds

Asynchronous Batch Flow (Detailed)

sequenceDiagram
    participant Client as 👤 Client
    participant API as 🌐 API Gateway
    participant Queue as 📬 Cloudflare Queue
    participant Worker as 🔄 Queue Consumer
    participant Compiler as ⚙️ Batch Compiler
    participant Cache as 💾 KV Cache
    
    Note over Client,Cache: Asynchronous Batch Compilation Flow
    
    Client->>API: POST /compile/batch/async
    Note right of Client: Request with 1-10 items
    
    API->>API: Generate request ID
    Note right of API: requestId: req-{timestamp}-{random}
    
    API->>Queue: Enqueue batch message
    Note right of Queue: Priority: standard or high
    
    Queue-->>API: ✅ Queued successfully
    API-->>Client: 202 Accepted
    Note left of API: Response includes:<br/>- requestId<br/>- priority<br/>- status
    
    Note over Client: Client can continue other work
    
    rect rgb(240, 240, 255)
        Note over Queue,Cache: Background Processing (async)
        
        Queue->>Queue: Batch messages
        Note right of Queue: Wait for batch timeout<br/>or max batch size
        
        Queue->>Worker: Deliver message batch
        
        Worker->>Compiler: Process batch
        
        loop For each item in batch
            Compiler->>Compiler: Compile filter list
            Compiler->>Cache: Store results
        end
        
        Worker->>Cache: Mark as completed
        Worker->>Queue: Acknowledge message
    end
    
    Note over Client: Later: client checks for results
    
    Client->>API: GET /queue/results/{requestId}
    API->>Cache: Lookup results
    
    alt Results ready
        Cache-->>API: ✅ Compilation results
        API-->>Client: 200 OK with results
    else Still processing
        Cache-->>API: ⏳ Not ready yet
        API-->>Client: 200 OK (status: processing)
    else Not found
        Cache-->>API: 🚫 Not found
        API-->>Client: 404 Not Found
    end

Priority Queue Routing

graph TB
    subgraph "Request Input"
        REQ[📨 Batch Request]
        PRIO{Priority<br/>Specified?}
    end
    
    subgraph "High Priority Path"
        HPQ[⚡ High Priority Queue]
        HPC[Fast Consumer<br/>Batch: 5<br/>Timeout: 2s]
        HPP[Quick Processing]
    end
    
    subgraph "Standard Priority Path"
        SPQ[📋 Standard Queue]
        SPC[Normal Consumer<br/>Batch: 10<br/>Timeout: 5s]
        SPP[Normal Processing]
    end
    
    subgraph "Processing Results"
        CACHE[💾 Cache Results]
        METRICS[📊 Record Metrics]
    end
    
    REQ --> PRIO
    PRIO -->|priority: high| HPQ
    PRIO -->|priority: standard<br/>or not specified| SPQ
    
    HPQ --> HPC
    HPC --> HPP
    
    SPQ --> SPC
    SPC --> SPP
    
    HPP --> CACHE
    SPP --> CACHE
    CACHE --> METRICS
    
    style REQ fill:#667eea,stroke:#333,stroke-width:2px,color:#fff
    style PRIO fill:#f59e0b,stroke:#333,stroke-width:2px,color:#000
    style HPQ fill:#ef4444,stroke:#333,stroke-width:2px,color:#fff
    style HPC fill:#ef4444,stroke:#333,stroke-width:2px,color:#fff
    style HPP fill:#ef4444,stroke:#333,stroke-width:2px,color:#fff
    style SPQ fill:#10b981,stroke:#333,stroke-width:2px,color:#fff
    style SPC fill:#10b981,stroke:#333,stroke-width:2px,color:#fff
    style SPP fill:#10b981,stroke:#333,stroke-width:2px,color:#fff
    style CACHE fill:#3b82f6,stroke:#333,stroke-width:2px,color:#fff
    style METRICS fill:#8b5cf6,stroke:#333,stroke-width:2px,color:#fff

Code Examples

Example 1: Simple Synchronous Batch

Scenario: Compile 3 filter lists and get immediate results

graph LR
    subgraph "Your Code"
        CODE[📝 Make API Call]
    end
    
    subgraph "API Processing"
        PROC[⚙️ Compile 3 Lists<br/>Parallel execution]
    end
    
    subgraph "Results"
        RES[✅ 3 Compiled Lists<br/>Immediately returned]
    end
    
    CODE -->|POST request| PROC
    PROC -->|2-10 seconds| RES
    
    style CODE fill:#667eea,stroke:#333,stroke-width:2px,color:#fff
    style PROC fill:#10b981,stroke:#333,stroke-width:2px,color:#fff
    style RES fill:#3b82f6,stroke:#333,stroke-width:2px,color:#fff
// JavaScript/TypeScript example
const batchRequest = {
    requests: [
        {
            id: 'adguard-dns',
            configuration: {
                name: 'AdGuard DNS Filter',
                sources: [
                    {
                        source: 'https://adguardteam.github.io/AdGuardSDNSFilter/Filters/filter.txt',
                        transformations: ['RemoveComments', 'Validate']
                    }
                ],
                transformations: ['Deduplicate', 'RemoveEmptyLines']
            },
            benchmark: true
        },
        {
            id: 'easylist',
            configuration: {
                name: 'EasyList',
                sources: [
                    {
                        source: 'https://easylist.to/easylist/easylist.txt',
                        transformations: ['RemoveComments', 'Compress']
                    }
                ],
                transformations: ['Deduplicate']
            }
        },
        {
            id: 'custom-rules',
            configuration: {
                name: 'Custom Rules',
                sources: [
                    { source: 'my-custom-rules' }
                ]
            },
            preFetchedContent: {
                'my-custom-rules': '||ads.example.com^\n||tracking.example.com^'
            }
        }
    ]
};

// Send synchronous batch request
const response = await fetch('https://adblock-compiler.jayson-knight.workers.dev/compile/batch', {
    method: 'POST',
    headers: {
        'Content-Type': 'application/json'
    },
    body: JSON.stringify(batchRequest)
});

const results = await response.json();

// Process results
console.log('Batch compilation complete!');
results.results.forEach(result => {
    console.log(`${result.id}: ${result.ruleCount} rules`);
    console.log(`Compilation time: ${result.metrics?.totalDurationMs}ms`);
});

Expected Response:

{
    "success": true,
    "results": [
        {
            "id": "adguard-dns",
            "success": true,
            "rules": ["||ads.com^", "||tracker.net^", "..."],
            "ruleCount": 45234,
            "metrics": {
                "totalDurationMs": 2341,
                "sourceCount": 1,
                "transformationMetrics": [...]
            },
            "compiledAt": "2026-01-14T07:30:15.123Z"
        },
        {
            "id": "easylist",
            "success": true,
            "rules": ["||ad.example.com^", "..."],
            "ruleCount": 67891,
            "metrics": {
                "totalDurationMs": 3567
            },
            "compiledAt": "2026-01-14T07:30:16.234Z"
        },
        {
            "id": "custom-rules",
            "success": true,
            "rules": ["||ads.example.com^", "||tracking.example.com^"],
            "ruleCount": 2,
            "metrics": {
                "totalDurationMs": 45
            },
            "compiledAt": "2026-01-14T07:30:15.456Z"
        }
    ]
}

Example 2: Asynchronous Batch with Polling

Scenario: Queue 10 large filter lists for background processing

sequenceDiagram
    participant Code as 📝 Your Code
    participant API as 🌐 API
    participant Queue as 📬 Queue
    
    Note over Code,Queue: Step 1: Queue the batch
    Code->>API: POST /compile/batch/async
    API->>Queue: Enqueue
    API-->>Code: 202 Accepted<br/>{requestId: "req-123"}
    
    Note over Code: Your code continues...<br/>Do other work
    
    Note over Queue: Background: Processing...
    
    Note over Code,Queue: Step 2: Poll for results (after 30s)
    Code->>API: GET /queue/results/req-123
    API-->>Code: 200 OK<br/>{status: "processing"}
    
    Note over Code: Wait 30 more seconds
    
    Note over Queue: Compilation complete!
    
    Note over Code,Queue: Step 3: Get final results
    Code->>API: GET /queue/results/req-123
    API-->>Code: 200 OK<br/>{status: "completed", results: [...]}
// JavaScript/TypeScript example with async/await
async function compileBatchAsync() {
    // Step 1: Queue the batch
    const batchRequest = {
        requests: [
            // ... 10 compilation requests
            { id: 'list-1', configuration: { /* ... */ } },
            { id: 'list-2', configuration: { /* ... */ } },
            { id: 'list-3', configuration: { /* ... */ } },
            // ... up to list-10
        ]
    };
    
    const queueResponse = await fetch(
        'https://adblock-compiler.jayson-knight.workers.dev/compile/batch/async',
        {
            method: 'POST',
            headers: { 'Content-Type': 'application/json' },
            body: JSON.stringify(batchRequest)
        }
    );
    
    const queueData = await queueResponse.json();
    console.log('Batch queued:', queueData.requestId);
    
    // Step 2: Poll for results
    const requestId = queueData.requestId;
    let results = null;
    let attempts = 0;
    const maxAttempts = 10;
    
    while (!results && attempts < maxAttempts) {
        // Wait 30 seconds between polls
        await new Promise(resolve => setTimeout(resolve, 30000));
        
        const statusResponse = await fetch(
            `https://adblock-compiler.jayson-knight.workers.dev/queue/results/${requestId}`
        );
        
        const statusData = await statusResponse.json();
        
        if (statusData.status === 'completed') {
            results = statusData.results;
            console.log('Batch complete! Got results for', results.length, 'items');
        } else if (statusData.status === 'failed') {
            throw new Error('Batch compilation failed: ' + statusData.error);
        } else {
            console.log('Still processing... attempt', ++attempts);
        }
    }
    
    if (!results) {
        throw new Error('Timeout waiting for results');
    }
    
    return results;
}

// Usage
try {
    const results = await compileBatchAsync();
    results.forEach(result => {
        console.log(`${result.id}: ${result.ruleCount} rules`);
    });
} catch (error) {
    console.error('Batch compilation error:', error);
}

Example 3: Python with Requests Library

import requests
import time
from typing import List, Dict

BASE_URL = 'https://adblock-compiler.jayson-knight.workers.dev'

def compile_batch_async(requests_data: List[Dict]) -> List[Dict]:
    """
    Compile multiple filter lists asynchronously
    
    Args:
        requests_data: List of compilation requests (max 10)
    
    Returns:
        List of compilation results
    """
    
    # Step 1: Queue the batch
    response = requests.post(
        f'{BASE_URL}/compile/batch/async',
        json={'requests': requests_data}
    )
    response.raise_for_status()
    
    queue_data = response.json()
    request_id = queue_data['requestId']
    print(f'📬 Batch queued: {request_id}')
    print(f'⚡ Priority: {queue_data["priority"]}')
    
    # Step 2: Poll for results
    max_attempts = 20
    poll_interval = 30  # seconds
    
    for attempt in range(max_attempts):
        print(f'⏳ Checking status (attempt {attempt + 1}/{max_attempts})...')
        
        response = requests.get(f'{BASE_URL}/queue/results/{request_id}')
        response.raise_for_status()
        
        data = response.json()
        
        if data.get('status') == 'completed':
            print('✅ Batch compilation complete!')
            return data['results']
        elif data.get('status') == 'failed':
            raise Exception(f'Batch failed: {data.get("error")}')
        else:
            if attempt < max_attempts - 1:
                print(f'⌛ Still processing, waiting {poll_interval} seconds...')
                time.sleep(poll_interval)
    
    raise TimeoutError('Timeout waiting for batch completion')


# Example usage
if __name__ == '__main__':
    batch_requests = [
        {
            'id': 'adguard',
            'configuration': {
                'name': 'AdGuard DNS',
                'sources': [
                    {
                        'source': 'https://adguardteam.github.io/AdGuardSDNSFilter/Filters/filter.txt'
                    }
                ],
                'transformations': ['Deduplicate', 'RemoveEmptyLines']
            },
            'benchmark': True
        },
        {
            'id': 'easylist',
            'configuration': {
                'name': 'EasyList',
                'sources': [
                    {
                        'source': 'https://easylist.to/easylist/easylist.txt'
                    }
                ],
                'transformations': ['Deduplicate']
            }
        }
    ]
    
    try:
        results = compile_batch_async(batch_requests)
        
        print('\n📊 Results Summary:')
        for result in results:
            print(f"  {result['id']}: {result['ruleCount']} rules")
            print(f"    Time: {result['metrics']['totalDurationMs']}ms")
    
    except Exception as e:
        print(f'❌ Error: {e}')

Example 4: cURL Commands

# Example: Synchronous batch compilation
curl -X POST https://adblock-compiler.jayson-knight.workers.dev/compile/batch \
  -H "Content-Type: application/json" \
  -d '{
    "requests": [
      {
        "id": "test-1",
        "configuration": {
          "name": "Test List 1",
          "sources": [
            {
              "source": "my-rules-1"
            }
          ]
        },
        "preFetchedContent": {
          "my-rules-1": "||ads.com^\n||tracker.net^"
        }
      },
      {
        "id": "test-2",
        "configuration": {
          "name": "Test List 2",
          "sources": [
            {
              "source": "my-rules-2"
            }
          ]
        },
        "preFetchedContent": {
          "my-rules-2": "||spam.org^\n||malware.biz^"
        }
      }
    ]
  }'
# Example: Asynchronous batch compilation

# Step 1: Queue the batch
curl -X POST https://adblock-compiler.jayson-knight.workers.dev/compile/batch/async \
  -H "Content-Type: application/json" \
  -d '{
    "requests": [
      {
        "id": "large-list-1",
        "configuration": {
          "name": "Large Filter List",
          "sources": [
            {
              "source": "https://example.com/large-list.txt"
            }
          ],
          "transformations": ["Deduplicate", "Compress"]
        }
      }
    ]
  }'

# Response will include a requestId, e.g.:
# {
#   "success": true,
#   "requestId": "req-1704931200000-abc123",
#   "priority": "standard"
# }

# Step 2: Check status (wait 30 seconds, then run this)
curl https://adblock-compiler.jayson-knight.workers.dev/queue/results/req-1704931200000-abc123

# If still processing, you'll get:
# {
#   "success": true,
#   "status": "processing"
# }

# When complete, you'll get full results:
# {
#   "success": true,
#   "status": "completed",
#   "results": [...]
# }

Best Practices

Batch Size Optimization

graph TB
    subgraph "Batch Size Decision Tree"
        START{How many<br/>lists?}
        
        START -->|1-3 items| SMALL[Small Batch]
        START -->|4-7 items| MEDIUM[Medium Batch]
        START -->|8-10 items| LARGE[Large Batch]
        START -->|>10 items| SPLIT[Split into<br/>multiple batches]
        
        SMALL --> SYNC1[✅ Use Sync API<br/>Fast response]
        MEDIUM --> CHOICE{Need immediate<br/>results?}
        LARGE --> ASYNC1[✅ Use Async API<br/>Reliable processing]
        SPLIT --> ASYNC2[✅ Use Async API<br/>Process separately]
        
        CHOICE -->|Yes| SYNC2[Use Sync API<br/>May be slower]
        CHOICE -->|No| ASYNC3[✅ Use Async API<br/>Recommended]
    end
    
    style START fill:#667eea,stroke:#333,stroke-width:2px,color:#fff
    style SMALL fill:#10b981,stroke:#333,stroke-width:2px,color:#fff
    style MEDIUM fill:#f59e0b,stroke:#333,stroke-width:2px,color:#fff
    style LARGE fill:#ef4444,stroke:#333,stroke-width:2px,color:#fff
    style SPLIT fill:#ef4444,stroke:#333,stroke-width:2px,color:#fff
    style SYNC1 fill:#3b82f6,stroke:#333,stroke-width:2px,color:#fff
    style SYNC2 fill:#3b82f6,stroke:#333,stroke-width:2px,color:#fff
    style ASYNC1 fill:#8b5cf6,stroke:#333,stroke-width:2px,color:#fff
    style ASYNC2 fill:#8b5cf6,stroke:#333,stroke-width:2px,color:#fff
    style ASYNC3 fill:#8b5cf6,stroke:#333,stroke-width:2px,color:#fff

Error Handling Strategy

graph TB
    subgraph "Error Handling Flow"
        REQ[📨 Send Batch Request]
        
        REQ --> CHECK{Response<br/>Status?}
        
        CHECK -->|400| VAL_ERR[❌ Validation Error]
        CHECK -->|429| RATE_ERR[❌ Rate Limit]
        CHECK -->|500| SRV_ERR[❌ Server Error]
        CHECK -->|200/202| SUCCESS[✅ Success]
        
        VAL_ERR --> FIX1[Fix request format<br/>Check item count]
        RATE_ERR --> WAIT1[Wait 60 seconds<br/>Retry with backoff]
        SRV_ERR --> RETRY1[Retry with<br/>exponential backoff]
        
        SUCCESS --> PROCESS{Processing<br/>Results}
        
        PROCESS --> ITEM_ERR{Any item<br/>failed?}
        ITEM_ERR -->|Yes| LOG[Log failure<br/>Continue with successful]
        ITEM_ERR -->|No| DONE[✅ All items<br/>successful]
    end
    
    style REQ fill:#667eea,stroke:#333,stroke-width:2px,color:#fff
    style CHECK fill:#f59e0b,stroke:#333,stroke-width:2px,color:#000
    style VAL_ERR fill:#ef4444,stroke:#333,stroke-width:2px,color:#fff
    style RATE_ERR fill:#ef4444,stroke:#333,stroke-width:2px,color:#fff
    style SRV_ERR fill:#ef4444,stroke:#333,stroke-width:2px,color:#fff
    style SUCCESS fill:#10b981,stroke:#333,stroke-width:2px,color:#fff
    style DONE fill:#10b981,stroke:#333,stroke-width:2px,color:#fff

Caching Strategy

graph LR
    subgraph "How Caching Works in Batches"
        REQ[📨 Batch Request<br/>3 items]
        
        REQ --> ITEM1[Item 1]
        REQ --> ITEM2[Item 2]
        REQ --> ITEM3[Item 3]
        
        ITEM1 --> CACHE1{Cache<br/>Hit?}
        ITEM2 --> CACHE2{Cache<br/>Hit?}
        ITEM3 --> CACHE3{Cache<br/>Hit?}
        
        CACHE1 -->|Yes| HIT1[⚡ Return cached<br/>~10ms]
        CACHE1 -->|No| COMPILE1[⚙️ Compile<br/>~2000ms]
        
        CACHE2 -->|Yes| HIT2[⚡ Return cached<br/>~10ms]
        CACHE2 -->|No| COMPILE2[⚙️ Compile<br/>~3000ms]
        
        CACHE3 -->|Yes| HIT3[⚡ Return cached<br/>~10ms]
        CACHE3 -->|No| COMPILE3[⚙️ Compile<br/>~1500ms]
        
        HIT1 --> RESULT
        COMPILE1 --> STORE1[💾 Cache for 1hr]
        STORE1 --> RESULT
        
        HIT2 --> RESULT
        COMPILE2 --> STORE2[💾 Cache for 1hr]
        STORE2 --> RESULT
        
        HIT3 --> RESULT
        COMPILE3 --> STORE3[💾 Cache for 1hr]
        STORE3 --> RESULT[📥 Return all results]
    end
    
    style REQ fill:#667eea,stroke:#333,stroke-width:2px,color:#fff
    style HIT1 fill:#10b981,stroke:#333,stroke-width:2px,color:#fff
    style HIT2 fill:#10b981,stroke:#333,stroke-width:2px,color:#fff
    style HIT3 fill:#10b981,stroke:#333,stroke-width:2px,color:#fff
    style COMPILE1 fill:#f59e0b,stroke:#333,stroke-width:2px,color:#fff
    style COMPILE2 fill:#f59e0b,stroke:#333,stroke-width:2px,color:#fff
    style COMPILE3 fill:#f59e0b,stroke:#333,stroke-width:2px,color:#fff
    style RESULT fill:#3b82f6,stroke:#333,stroke-width:2px,color:#fff

Performance Tips

mindmap
    root((Performance<br/>Tips))
        Request Optimization
            Use unique IDs
            Group similar lists
            Enable benchmarking for metrics
            Reuse configurations
        Caching
            Identical configs = cache hit
            1 hour TTL
            Check X-Cache header
            Warm cache with async
        Polling Strategy
            Start with 30s intervals
            Increase to 60s after 3 attempts
            Max 10-20 attempts
            Use webhooks when available
        Error Handling
            Retry with exponential backoff
            Handle partial failures
            Log all errors
            Monitor queue stats

Troubleshooting

Common Issues and Solutions

graph TB
    subgraph "Common Problems & Solutions"
        P1[❌ 400: Too many items]
        P2[❌ 400: Invalid configuration]
        P3[❌ 429: Rate limit exceeded]
        P4[❌ 404: Results not found]
        P5[⏳ Async taking too long]
        P6[❌ Partial failures]
        
        P1 --> S1[✅ Split batch into<br/>multiple requests<br/>Max 10 items per batch]
        P2 --> S2[✅ Validate JSON schema<br/>Check required fields<br/>Use OpenAPI spec]
        P3 --> S3[✅ Wait 60 seconds<br/>Use async API<br/>Implement backoff]
        P4 --> S4[✅ Results expired after 24h<br/>Check requestId spelling<br/>Re-run compilation]
        P5 --> S5[✅ Large lists take time<br/>Check queue stats<br/>Use high priority]
        P6 --> S6[✅ Check each item.success<br/>Successful items still returned<br/>Retry failed items]
    end
    
    style P1 fill:#ef4444,stroke:#333,stroke-width:2px,color:#fff
    style P2 fill:#ef4444,stroke:#333,stroke-width:2px,color:#fff
    style P3 fill:#ef4444,stroke:#333,stroke-width:2px,color:#fff
    style P4 fill:#ef4444,stroke:#333,stroke-width:2px,color:#fff
    style P5 fill:#f59e0b,stroke:#333,stroke-width:2px,color:#fff
    style P6 fill:#f59e0b,stroke:#333,stroke-width:2px,color:#fff
    style S1 fill:#10b981,stroke:#333,stroke-width:2px,color:#fff
    style S2 fill:#10b981,stroke:#333,stroke-width:2px,color:#fff
    style S3 fill:#10b981,stroke:#333,stroke-width:2px,color:#fff
    style S4 fill:#10b981,stroke:#333,stroke-width:2px,color:#fff
    style S5 fill:#10b981,stroke:#333,stroke-width:2px,color:#fff
    style S6 fill:#10b981,stroke:#333,stroke-width:2px,color:#fff

Debugging Workflow

graph TB
    START[🐛 Issue Detected]
    
    START --> STEP1{Check<br/>Response<br/>Status}
    
    STEP1 -->|4xx| CLIENT[Client Error]
    STEP1 -->|5xx| SERVER[Server Error]
    STEP1 -->|2xx| SUCCESS[Request OK]
    
    CLIENT --> CHECK_REQ[Review request body<br/>Validate against schema<br/>Check item count]
    SERVER --> CHECK_STATUS[Check queue stats<br/>Check worker health<br/>Retry request]
    SUCCESS --> CHECK_RESULTS{All items<br/>successful?}
    
    CHECK_RESULTS -->|No| PARTIAL[Partial Failure]
    CHECK_RESULTS -->|Yes| GOOD[✅ All Good!]
    
    PARTIAL --> ANALYZE[Analyze failed items<br/>Check error messages<br/>Retry individually]
    
    CHECK_REQ --> FIX[Fix and retry]
    CHECK_STATUS --> CONTACT[Contact support<br/>if persists]
    ANALYZE --> FIX
    
    style START fill:#667eea,stroke:#333,stroke-width:3px,color:#fff
    style CLIENT fill:#ef4444,stroke:#333,stroke-width:2px,color:#fff
    style SERVER fill:#ef4444,stroke:#333,stroke-width:2px,color:#fff
    style SUCCESS fill:#10b981,stroke:#333,stroke-width:2px,color:#fff
    style GOOD fill:#10b981,stroke:#333,stroke-width:2px,color:#fff
    style PARTIAL fill:#f59e0b,stroke:#333,stroke-width:2px,color:#fff
    style FIX fill:#3b82f6,stroke:#333,stroke-width:2px,color:#fff

Queue Status Monitoring

graph LR
    subgraph "Monitor Queue Health"
        API[🌐 GET /queue/stats]
        
        API --> METRICS[📊 Queue Metrics]
        
        METRICS --> PENDING[📋 Pending Jobs<br/>Currently queued]
        METRICS --> PROCESSING[⚙️ Processing Rate<br/>Jobs per minute]
        METRICS --> COMPLETED[✅ Completed Count<br/>Success total]
        METRICS --> FAILED[❌ Failed Count<br/>Error total]
        METRICS --> LAG[⏱️ Queue Lag<br/>Avg wait time]
        
        PENDING --> HEALTH{Queue<br/>Health?}
        LAG --> HEALTH
        
        HEALTH -->|Good| OK[✅ Normal Operation<br/>Lag < 5 seconds<br/>Pending < 100]
        HEALTH -->|Warning| WARN[⚠️ High Load<br/>Lag 5-30 seconds<br/>Pending 100-500]
        HEALTH -->|Critical| CRIT[🚨 Overloaded<br/>Lag > 30 seconds<br/>Pending > 500]
    end
    
    style API fill:#667eea,stroke:#333,stroke-width:2px,color:#fff
    style METRICS fill:#3b82f6,stroke:#333,stroke-width:2px,color:#fff
    style OK fill:#10b981,stroke:#333,stroke-width:2px,color:#fff
    style WARN fill:#f59e0b,stroke:#333,stroke-width:2px,color:#fff
    style CRIT fill:#ef4444,stroke:#333,stroke-width:2px,color:#fff

Quick Reference

API Endpoints Summary

EndpointMethodPurposeReturns
/compile/batchPOSTSynchronous batch compilationImmediate results
/compile/batch/asyncPOSTAsynchronous batch compilationRequest ID
/queue/results/:idGETGet async resultsResults or status
/queue/statsGETQueue statisticsMetrics

Request Limits

graph LR
    subgraph "Batch API Limits"
        L1[📊 Max Items: 10<br/>per batch]
        L2[⏱️ Sync Timeout: 30s<br/>total execution]
        L3[🚦 Rate Limit: 10<br/>requests/minute]
        L4[📦 Max Size: 1MB<br/>request body]
        L5[💾 Cache TTL: 1 hour<br/>result storage]
        L6[📁 Result TTL: 24 hours<br/>async results]
    end
    
    style L1 fill:#667eea,stroke:#333,stroke-width:2px,color:#fff
    style L2 fill:#667eea,stroke:#333,stroke-width:2px,color:#fff
    style L3 fill:#667eea,stroke:#333,stroke-width:2px,color:#fff
    style L4 fill:#667eea,stroke:#333,stroke-width:2px,color:#fff
    style L5 fill:#667eea,stroke:#333,stroke-width:2px,color:#fff
    style L6 fill:#667eea,stroke:#333,stroke-width:2px,color:#fff

Decision Matrix

graph TB
    subgraph "Choose the Right API"
        Q1{How many<br/>filter lists?}
        Q2{Need results<br/>immediately?}
        Q3{Lists are<br/>large/slow?}
        
        Q1 -->|1| SINGLE[Use /compile]
        Q1 -->|2-10| Q2
        Q1 -->|>10| MULTI[Split into<br/>multiple batches]
        
        Q2 -->|Yes| Q3
        Q2 -->|No| ASYNC_B[✅ /compile/batch/async]
        
        Q3 -->|Yes| ASYNC_B2[✅ /compile/batch/async]
        Q3 -->|No| SYNC_B[✅ /compile/batch]
    end
    
    style Q1 fill:#f59e0b,stroke:#333,stroke-width:2px,color:#000
    style Q2 fill:#f59e0b,stroke:#333,stroke-width:2px,color:#000
    style Q3 fill:#f59e0b,stroke:#333,stroke-width:2px,color:#000
    style SINGLE fill:#3b82f6,stroke:#333,stroke-width:2px,color:#fff
    style SYNC_B fill:#10b981,stroke:#333,stroke-width:2px,color:#fff
    style ASYNC_B fill:#8b5cf6,stroke:#333,stroke-width:2px,color:#fff
    style ASYNC_B2 fill:#8b5cf6,stroke:#333,stroke-width:2px,color:#fff
    style MULTI fill:#ef4444,stroke:#333,stroke-width:2px,color:#fff


Need Help?


Last updated: 2026-01-14

OpenAPI Support in Adblock Compiler

Summary

Yes, this package fully supports OpenAPI 3.0.3!

The Adblock Compiler includes comprehensive OpenAPI documentation and tooling for the REST API. This support was already implemented but wasn't prominently featured in the main README, so we've enhanced the documentation to make it more discoverable.

What's Included

1. OpenAPI Specification (docs/api/openapi.yaml)

A complete OpenAPI 3.0.3 specification documenting:

  • 10 API endpoints including compilation, streaming, batch processing, queues, and metrics
  • 25+ schema definitions with detailed request/response types
  • Security schemes (Cloudflare Turnstile support)
  • Server configurations for production and local development
  • WebSocket documentation for real-time bidirectional communication
  • Error responses with proper status codes and schemas
  • Request examples for key endpoints

Validation Status: ✅ Valid (0 errors, 35 minor warnings about schema descriptions)

2. Validation Tools

# Validate the OpenAPI specification
deno task openapi:validate

The validation script checks:

  • YAML syntax
  • OpenAPI version compatibility
  • Required fields completeness
  • Unique operation IDs
  • Response definitions
  • Best practices compliance

3. Documentation Generation

# Generate interactive HTML documentation
deno task openapi:docs

Generates:

  • Interactive HTML docs using Redoc at docs/api/index.html
  • Markdown reference at docs/api/README.md

Features:

  • 🔍 Search functionality
  • 📱 Responsive design
  • 🎨 Code samples
  • 📊 Interactive schema browser
  • 🔗 Deep linking

4. Contract Testing

# Run contract tests against the API
deno task test:contract

Tests validate that the live API conforms to the OpenAPI specification:

  • Response status codes match spec
  • Response content types are correct
  • Required fields are present
  • Data types match schemas
  • Headers conform to spec

5. Comprehensive Documentation

API Endpoints Documented

Compilation Endpoints

  • POST /compile - Synchronous compilation with JSON response
  • POST /compile/stream - Real-time streaming via Server-Sent Events (SSE)
  • POST /compile/batch - Batch processing (up to 10 lists in parallel)

Async Queue Operations

  • POST /compile/async - Queue async compilation job
  • POST /compile/batch/async - Queue batch compilation
  • GET /queue/stats - Queue health metrics
  • GET /queue/results/{requestId} - Retrieve job results

WebSocket

  • GET /ws/compile - Bidirectional real-time communication

Metrics & Monitoring

  • GET /api - API information and version
  • GET /metrics - Performance metrics

Using the OpenAPI Spec

1. Generate Client SDKs

Use the OpenAPI spec to generate client libraries in multiple languages:

# TypeScript/JavaScript
openapi-generator-cli generate -i docs/api/openapi.yaml -g typescript-fetch -o ./client

# Python
openapi-generator-cli generate -i docs/api/openapi.yaml -g python -o ./client

# Go
openapi-generator-cli generate -i docs/api/openapi.yaml -g go -o ./client

# And many more languages...

2. Import into API Testing Tools

Postman:

File → Import → docs/api/openapi.yaml

Insomnia:

Create → Import From → File → docs/api/openapi.yaml

Swagger UI: Host the docs/api/openapi.yaml file and point Swagger UI to it.

3. API Client Testing

# Test against production
curl https://adblock-compiler.jayson-knight.workers.dev/api

# Get API information
curl -X POST https://adblock-compiler.jayson-knight.workers.dev/compile \
  -H "Content-Type: application/json" \
  -d @request.json

4. CI/CD Integration

The OpenAPI validation and contract tests can be integrated into your CI/CD pipeline:

# Example GitHub Actions workflow
- name: Validate OpenAPI spec
  run: deno task openapi:validate

- name: Generate documentation
  run: deno task openapi:docs

- name: Run contract tests
  run: deno task test:contract

Quick Start

# 1. Validate the OpenAPI specification
deno task openapi:validate

# 2. Generate interactive documentation
deno task openapi:docs

# 3. View the documentation
open docs/api/index.html

# 4. Run contract tests
deno task test:contract

Live Resources

  • Production API: https://adblock-compiler.jayson-knight.workers.dev/api
  • Web UI: https://adblock-compiler.jayson-knight.workers.dev/
  • OpenAPI Spec: openapi.yaml
  • Generated Docs: index.html

What Changed in This PR

To make OpenAPI support more discoverable, we:

  1. ✅ Added OpenAPI 3.0.3 badge to README
  2. ✅ Added OpenAPI to the Features list
  3. ✅ Created dedicated "OpenAPI Specification" section in README
  4. ✅ Linked to existing comprehensive documentation
  5. ✅ Added examples of using the OpenAPI spec with code generation tools
  6. ✅ Verified validation and documentation generation works

Conclusion

The Adblock Compiler has excellent OpenAPI support with:

  • Complete API documentation
  • Validation tooling
  • Contract testing
  • Documentation generation
  • Integration with standard OpenAPI ecosystem tools

All the infrastructure was already in place—we've just made it more visible in the main documentation!

Learn More

OpenAPI Tooling Guide

Complete guide to validating, testing, and documenting the Adblock Compiler API using the OpenAPI specification.

📋 Table of Contents

Overview

The Adblock Compiler API is fully documented using the OpenAPI 3.0.3 specification (docs/api/openapi.yaml). This specification serves as the single source of truth for:

  • API endpoint definitions
  • Request/response schemas
  • Authentication requirements
  • Error responses
  • Examples and documentation

Validation

Validate OpenAPI Spec

Ensure your docs/api/openapi.yaml conforms to the OpenAPI specification:

# Run validation
deno task openapi:validate

# Or directly
./scripts/validate-openapi.ts

What it checks:

  • ✅ YAML syntax
  • ✅ OpenAPI version compatibility
  • ✅ Required fields (info, paths, etc.)
  • ✅ Unique operation IDs
  • ✅ Response definitions
  • ✅ Schema completeness
  • ✅ Best practices compliance

Example output:

🔍 Validating OpenAPI specification...

✅ YAML syntax is valid
✅ OpenAPI version: 3.0.3
✅ Title: Adblock Compiler API
✅ Version: 2.0.0
✅ Servers: 2 defined
✅ Paths: 10 endpoints defined
✅ Operations: 13 total
✅ Schemas: 30 defined
✅ Security schemes: 1 defined
✅ Tags: 5 defined

📋 Checking best practices...

✅ Request examples: 2 found
✅ Contact info provided
✅ License: GPL-3.0

============================================================
VALIDATION RESULTS
============================================================

✅ OpenAPI specification is VALID!

Summary: 0 errors, 0 warnings

Pre-commit Validation

Add to your git hooks:

#!/bin/sh
# .git/hooks/pre-commit
deno task openapi:validate || exit 1

Documentation Generation

Generate HTML Documentation

Create beautiful, interactive API documentation using Redoc:

# Generate docs
deno task openapi:docs

# Or directly
./scripts/generate-docs.ts

Output files:

  • docs/api/index.html - Interactive HTML documentation (Redoc)
  • docs/api/README.md - Markdown reference documentation

Generate Cloudflare API Shield Schema

Generate a Cloudflare-compatible schema for use with Cloudflare's API Shield Schema Validation:

# Generate Cloudflare schema
deno task schema:cloudflare

# Or directly
./scripts/generate-cloudflare-schema.ts

What it does:

  • ✅ Filters out localhost servers (keeps only production/staging URLs)
  • ✅ Removes non-standard x-* extension fields from operations
  • ✅ Generates docs/api/cloudflare-schema.yaml ready for API Shield

Why use this: Cloudflare's API Shield Schema Validation provides request/response validation at the edge. The generated schema is optimized for Cloudflare's parser by removing development servers and custom extensions that may not be compatible.

Learn more: Cloudflare API Shield Schema Validation

CI/CD Integration: The schema generation is validated in CI to ensure it stays in sync with the main OpenAPI spec. If you update docs/api/openapi.yaml, you must regenerate the Cloudflare schema by running deno task schema:cloudflare and committing the result.

View Documentation

# Open HTML docs
open docs/api/index.html

# Or serve locally
python3 -m http.server 8000 --directory docs/api
# Then visit http://localhost:8000

Features

The generated HTML documentation includes:

  • 🔍 Search functionality - Find endpoints quickly
  • 📱 Responsive design - Works on mobile/tablet/desktop
  • 🎨 Code samples - Request/response examples
  • 📊 Schema explorer - Interactive schema browser
  • 🔗 Deep linking - Share links to specific endpoints
  • 📥 Download spec - Export OpenAPI YAML/JSON

Customization

Edit scripts/generate-docs.ts to customize:

  • Theme colors
  • Logo/branding
  • Sidebar configuration
  • Code sample languages

Contract Testing

Contract tests validate that your live API conforms to the OpenAPI specification.

Run Contract Tests

# Test against local server (default)
deno task test:contract

# Test against production
API_BASE_URL=https://adblock-compiler.jayson-knight.workers.dev deno task test:contract

# Test specific scenarios
deno test --allow-read --allow-write --allow-net --allow-env worker/openapi-contract.test.ts --filter "Contract: GET /api"

What's Tested

Core Endpoints:

  • ✅ GET /api - API info
  • ✅ GET /metrics - Performance metrics
  • ✅ POST /compile - Synchronous compilation
  • ✅ POST /compile/stream - SSE streaming
  • ✅ POST /compile/batch - Batch processing

Async Queue Operations (Cloudflare Queues):

  • ✅ POST /compile/async - Queue async job
  • ✅ POST /compile/batch/async - Queue batch job
  • ✅ GET /queue/stats - Queue statistics
  • ✅ GET /queue/results/{id} - Retrieve job results

Contract Validation:

  • ✅ Response status codes match spec
  • ✅ Response content types are correct
  • ✅ Required fields are present
  • ✅ Data types match schemas
  • ✅ Headers conform to spec (X-Cache, X-Request-Deduplication)
  • ✅ Error responses have proper structure

Async Testing with Queues

The contract tests properly validate Cloudflare Queue integration:

// Queues async compilation
const response = await apiRequest('/compile/async', {
    method: 'POST',
    body: JSON.stringify({ configuration, preFetchedContent }),
});

// Returns 202 if queues available, 500 if not configured
validateResponseStatus(response, [202, 500]);

if (response.status === 202) {
    const data = await response.json();
    // Validates requestId is returned
    validateBasicSchema(data, ['success', 'requestId', 'message']);
}

Queue Test Scenarios

  1. Standard Priority Queue

    • Tests default queue behavior
    • Validates requestId generation
    • Confirms job queuing
  2. High Priority Queue

    • Tests priority routing
    • Validates faster processing (when implemented)
  3. Batch Queue Operations

    • Tests multiple jobs queued together
    • Validates batch requestId tracking
  4. Queue Statistics

    • Validates queue depth metrics
    • Confirms job status tracking
    • Tests history retention

CI/CD Contract Testing

# .github/workflows/contract-tests.yml
name: Contract Tests

on: [push, pull_request]

jobs:
  contract-tests:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      
      - uses: denoland/setup-deno@v1
        with:
          deno-version: v2.x
      
      - name: Start local server
        run: deno task dev &
        
      - name: Wait for server
        run: sleep 5
        
      - name: Run contract tests
        run: deno task test:contract

Postman Testing

See POSTMAN_TESTING.md for complete Postman documentation.

Generate / Regenerate the Postman Collection

The Postman collection and environment files are auto-generated from docs/api/openapi.yaml. Do not edit them directly.

# Regenerate from the canonical OpenAPI spec
deno task postman:collection

This creates / updates:

  • docs/postman/postman-collection.json — all API requests with automated test assertions
  • docs/postman/postman-environment.json — local and production environment variables

The CI validate-postman-collection job regenerates the files and fails the build if the committed copies are out of sync with docs/api/openapi.yaml. Always run deno task postman:collection and commit the result whenever you change the spec.

Schema Hierarchy

docs/api/openapi.yaml                 ← canonical source of truth (edit this)
docs/api/cloudflare-schema.yaml       ← auto-generated (deno task schema:cloudflare)
docs/postman/postman-collection.json  ← auto-generated (deno task postman:collection)
docs/postman/postman-environment.json ← auto-generated (deno task postman:collection)

Quick Start

# Import collection and environment into Postman
# - docs/postman/postman-collection.json
# - docs/postman/postman-environment.json

# Or use Newman CLI
npm install -g newman
newman run docs/postman/postman-collection.json -e docs/postman/postman-environment.json

Postman Features

  • 🧪 25+ test requests
  • ✅ Automated assertions
  • 📊 Response validation
  • 🔄 Dynamic variables
  • 📈 Performance testing

CI/CD Integration

GitHub Actions

Complete pipeline for validation, testing, and documentation:

name: OpenAPI Pipeline

on: [push, pull_request]

jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: denoland/setup-deno@v1
      
      - name: Validate OpenAPI spec
        run: deno task openapi:validate

  validate-cloudflare-schema:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: denoland/setup-deno@v1
      
      - name: Generate Cloudflare schema
        run: deno task schema:cloudflare
      
      - name: Check schema is up to date
        run: |
          if ! git diff --quiet docs/api/cloudflare-schema.yaml; then
            echo "❌ Cloudflare schema is out of date!"
            echo "Run 'deno task schema:cloudflare' and commit the result."
            exit 1
          fi

  generate-docs:
    needs: validate
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: denoland/setup-deno@v1
      
      - name: Generate documentation
        run: deno task openapi:docs
      
      - name: Upload docs
        uses: actions/upload-artifact@v3
        with:
          name: api-docs
          path: docs/api/

  contract-tests:
    needs: validate
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: denoland/setup-deno@v1
      
      - name: Start server
        run: deno task dev &
        
      - name: Wait for server
        run: sleep 10
      
      - name: Run contract tests
        run: deno task test:contract
        
  postman-tests:
    needs: validate
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      
      - name: Start server
        run: docker compose up -d
        
      - name: Install Newman
        run: npm install -g newman
        
      - name: Run Postman tests
        run: newman run docs/postman/postman-collection.json -e docs/postman/postman-environment.json --reporters cli,json
        
      - name: Upload results
        uses: actions/upload-artifact@v3
        with:
          name: newman-results
          path: newman/

Pre-deployment Checks

#!/bin/bash
# scripts/pre-deploy.sh

echo "🔍 Validating OpenAPI spec..."
deno task openapi:validate || exit 1

echo "☁️  Generating Cloudflare schema..."
deno task schema:cloudflare || exit 1

echo "📚 Generating documentation..."
deno task openapi:docs || exit 1

echo "🧪 Running contract tests..."
deno task test:contract || exit 1

echo "✅ All checks passed! Ready to deploy."

Best Practices

1. Keep Spec and Code in Sync

Problem: Spec drifts from actual implementation

Solution:

  • Run contract tests on every PR
  • Use CI/CD to block deployment if tests fail
  • Review OpenAPI changes alongside code changes
# Add to .git/hooks/pre-push
deno task openapi:validate
deno task test:contract

2. Version Your API

Current version: 2.0.0 in docs/api/openapi.yaml

When making breaking changes:

  1. Increment major version (2.0.0 → 3.0.0)
  2. Update info.version in docs/api/openapi.yaml
  3. Document changes in CHANGELOG.md
  4. Consider API versioning in URLs

3. Document Examples

Good:

requestBody:
  content:
    application/json:
      schema:
        $ref: '#/components/schemas/CompileRequest'
      examples:
        simple:
          summary: Simple compilation
          value:
            configuration:
              name: My Filter List
              sources:
                - source: test-rules

Why: Examples improve documentation and serve as test data.

4. Use Async Queues Appropriately

When to use Cloudflare Queues:

Use queues for:

  • Long-running compilations (>5 seconds)
  • Large batch operations
  • Background processing
  • Rate limit avoidance
  • Retry-able operations

Don't use queues for:

  • Quick operations (<1 second)
  • Real-time user interactions
  • Operations needing immediate feedback

Implementation:

// Queue job
const requestId = await queueCompileJob(env, configuration, preFetchedContent);

// Return immediately
return Response.json({
    success: true,
    requestId,
    message: 'Job queued for processing'
}, { status: 202 });

// Client polls for results
GET /queue/results/{requestId}

5. Test Queue Scenarios

Always test queue operations:

# Test queue availability
deno test --filter "Contract: POST /compile/async"

# Test queue stats
deno test --filter "Contract: GET /queue/stats"

# Test result retrieval
deno test --filter "Contract: GET /queue/results"

6. Monitor Queue Health

Track queue metrics:

  • Queue depth (pending jobs)
  • Processing rate (jobs/minute)
  • Average processing time
  • Failure rate
  • Retry rate

Access via: GET /queue/stats

7. Handle Queue Unavailability

Queues may not be configured in all environments:

if (!env.ADBLOCK_COMPILER_QUEUE) {
    return Response.json({
        success: false,
        error: 'Queue not available. Use synchronous endpoints instead.'
    }, { status: 500 });
}

Contract tests handle this gracefully:

validateResponseStatus(response, [202, 500]); // Both OK

Troubleshooting

Validation Fails

❌ Missing "operationId" for POST /compile

Fix: Add unique operationId to all operations in docs/api/openapi.yaml

Contract Tests Fail

Expected status 200, got 500

Fix:

  1. Check server logs
  2. Verify request body matches schema
  3. Ensure queue bindings configured (for async endpoints)

Documentation Not Generating

Failed to parse YAML

Fix: Validate YAML syntax:

deno task openapi:validate

Queue Tests Always Return 500

Cause: Cloudflare Queues not configured locally

Expected: Queues are production-only. Tests accept 202 OR 500.

Fix: Deploy to Cloudflare Workers to test queue functionality.

Resources

Summary

The OpenAPI tooling provides:

  1. Validation - Ensure spec quality (openapi:validate)
  2. Documentation - Generate beautiful docs (openapi:docs)
  3. Cloudflare Schema - Generate API Shield schema (schema:cloudflare)
  4. Postman Collection - Regenerate from spec (postman:collection)
  5. Contract Tests - Verify API compliance (test:contract)
  6. Queue Support - Async operations via Cloudflare Queues

Schema Hierarchy

docs/api/openapi.yaml                 ← canonical source of truth (edit this)
docs/api/cloudflare-schema.yaml       ← auto-generated (deno task schema:cloudflare)
docs/postman/postman-collection.json  ← auto-generated (deno task postman:collection)
docs/postman/postman-environment.json ← auto-generated (deno task postman:collection)

All tools are designed to work together in a continuous integration pipeline, ensuring your API stays consistent, well-documented, and reliable.

OpenAPI Quick Reference

Quick commands and workflows for working with the OpenAPI specification.

🚀 Quick Start

# Validate spec
deno task openapi:validate

# Generate docs
deno task openapi:docs

# Run contract tests
deno task test:contract

# View generated docs
open docs/api/index.html

📋 Common Tasks

Before Committing

# Validate OpenAPI spec
deno task openapi:validate

# Run all tests
deno task test

# Run contract tests
deno task test:contract

Before Deploying

# Full validation pipeline
deno task openapi:validate && \
deno task openapi:docs && \
deno task test:contract

# Deploy
deno task wrangler:deploy

Testing Specific Endpoints

# Test sync compilation
deno test --filter "Contract: POST /compile" worker/openapi-contract.test.ts --allow-read --allow-write --allow-net --allow-env

# Test async queue
deno test --filter "Contract: POST /compile/async" worker/openapi-contract.test.ts --allow-read --allow-write --allow-net --allow-env

# Test streaming
deno test --filter "Contract: POST /compile/stream" worker/openapi-contract.test.ts --allow-read --allow-write --allow-net --allow-env

🔄 Async Queue Operations

Key Concepts

Cloudflare Queues are used for:

  • Long-running compilations (>5 seconds)
  • Batch operations
  • Background processing
  • Rate limit avoidance

Queue Workflow

1. POST /compile/async → Returns 202 + requestId
2. Job processes in background
3. GET /queue/results/{requestId} → Returns results
4. GET /queue/stats → Monitor queue health

Testing Queues

# Test queue functionality
deno test --filter "Queue" worker/openapi-contract.test.ts --allow-read --allow-write --allow-net --allow-env

# Note: Local tests may return 500 (queue not configured)
# This is expected - queues work in production

Queue Configuration

In wrangler.toml:

[[queues.producers]]
queue = "adblock-compiler-queue"
binding = "ADBLOCK_COMPILER_QUEUE"

[[queues.producers]]
queue = "adblock-compiler-queue-high-priority"
binding = "ADBLOCK_COMPILER_QUEUE_HIGH_PRIORITY"

[[queues.consumers]]
queue = "adblock-compiler-queue"
max_batch_size = 10
max_batch_timeout = 30

📊 Response Codes

Success Codes

  • 200 - OK (sync operations)
  • 202 - Accepted (async operations queued)

Client Error Codes

  • 400 - Bad Request (invalid input, batch limit exceeded)
  • 404 - Not Found (queue result not found)
  • 429 - Rate Limited

Server Error Codes

  • 500 - Internal Error (validation failed, queue unavailable)

📝 Schema Validation

Request Validation

All requests are validated against OpenAPI schemas:

{
  "configuration": {
    "name": "Required string",
    "sources": [
      {
        "source": "Required string"
      }
    ]
  }
}

Response Validation

Contract tests verify:

  • ✅ Status codes match spec
  • ✅ Content-Type headers correct
  • ✅ Required fields present
  • ✅ Data types match
  • ✅ Custom headers (X-Cache, X-Request-Deduplication)

🧪 Postman Testing

# Regenerate collection from OpenAPI spec
deno task postman:collection

# Run all Postman tests
newman run docs/postman/postman-collection.json -e docs/postman/postman-environment.json

# Run specific folder
newman run docs/postman/postman-collection.json -e docs/postman/postman-environment.json --folder "Compilation"

# With detailed reporting
newman run docs/postman/postman-collection.json -e docs/postman/postman-environment.json --reporters cli,json,html

📈 Monitoring

Queue Metrics

# Get queue statistics
curl http://localhost:8787/queue/stats

# Response:
{
  "pending": 0,
  "completed": 42,
  "failed": 1,
  "cancelled": 0,
  "totalProcessingTime": 12500,
  "averageProcessingTime": 297,
  "processingRate": 8.4,
  "queueLag": 150
}

Performance Metrics

# Get API metrics
curl http://localhost:8787/metrics

# Response shows:
# - Request counts per endpoint
# - Success/failure rates
# - Average durations
# - Error types

🐛 Troubleshooting

Validation Errors

❌ Missing "operationId" for POST /compile

→ Add operationId to endpoint in docs/api/openapi.yaml

Contract Test Failures

❌ Expected status 200, got 500

→ Check server logs, verify request matches schema

Queue Always Returns 500

❌ Queue bindings are not available

→ Expected locally. Queues work in production with Cloudflare Workers

Documentation Won't Generate

❌ Failed to parse YAML

→ Run deno task openapi:validate to check syntax

📚 File Locations

docs/api/openapi.yaml                 # OpenAPI specification (canonical source — edit this)
docs/api/cloudflare-schema.yaml       # Auto-generated (deno task schema:cloudflare)
docs/postman/postman-collection.json  # Auto-generated (deno task postman:collection)
docs/postman/postman-environment.json # Auto-generated (deno task postman:collection)
scripts/validate-openapi.ts           # Validation script
scripts/generate-docs.ts              # Documentation generator
scripts/generate-postman-collection.ts # Postman generator
worker/openapi-contract.test.ts       # Contract tests
docs/api/index.html                   # Generated HTML docs
docs/api/README.md                    # Generated markdown docs
docs/api/OPENAPI_TOOLING.md           # Complete guide
docs/postman/README.md                # Postman collection guide
docs/testing/POSTMAN_TESTING.md       # Postman testing guide

💡 Tips

  1. Always validate before committing:

    deno task openapi:validate
    
  2. Test against local server first:

    deno task dev &
    sleep 3
    deno task test:contract
    
  3. Update docs when changing endpoints:

    # Edit docs/api/openapi.yaml
    deno task openapi:docs
    git add docs/api/
    
  4. Use queue for long operations:

    • Synchronous: POST /compile (< 5 seconds)
    • Asynchronous: POST /compile/async (> 5 seconds)
  5. Monitor queue health:

    watch -n 5 'curl -s http://localhost:8787/queue/stats | jq'
    

For detailed information, see OPENAPI_TOOLING.md

Streaming API Documentation

The adblock-compiler now provides comprehensive real-time event streaming capabilities through Server-Sent Events (SSE) and WebSocket connections, with enhanced diagnostic, cache, network, and performance metric events.

Overview

Enhanced Event Types

Both SSE and WebSocket endpoints now stream:

  1. Compilation Events: Source downloads, transformations, progress
  2. Diagnostic Events: Tracing system events with severity levels
  3. Cache Events: Cache hit/miss/write operations
  4. Network Events: HTTP requests with timing and size
  5. Performance Metrics: Download speeds, processing times, etc.

Server-Sent Events (SSE)

Endpoint

POST /compile/stream

Enhanced Event Types

Standard Compilation Events

  • log - Log messages with levels (info, warn, error, debug)
  • source:start - Source download started
  • source:complete - Source download completed
  • source:error - Source download failed
  • transformation:start - Transformation started
  • transformation:complete - Transformation completed with metrics
  • progress - Compilation progress updates
  • result - Final compilation result
  • done - Compilation finished
  • error - Compilation error

New Enhanced Events

  • diagnostic - Diagnostic events from tracing system
  • cache - Cache operations (hit/miss/write/evict)
  • network - Network operations (HTTP requests)
  • metric - Performance metrics

Example: Diagnostic Event

event: diagnostic
data: {
  "eventId": "evt-abc123",
  "timestamp": "2026-01-14T05:00:00Z",
  "category": "compilation",
  "severity": "info",
  "message": "Started source download",
  "correlationId": "comp-xyz789",
  "metadata": {
    "sourceName": "AdGuard DNS Filter",
    "sourceUrl": "https://..."
  }
}

Example: Cache Event

event: cache
data: {
  "eventId": "evt-cache-1",
  "category": "cache",
  "operation": "hit",
  "key": "cache:abc123xyz",
  "size": 51200
}

Example: Network Event

event: network
data: {
  "method": "GET",
  "url": "https://example.com/filters.txt",
  "statusCode": 200,
  "durationMs": 234,
  "responseSize": 51200
}

Example: Performance Metric

event: metric
data: {
  "metric": "download_speed",
  "value": 218.5,
  "unit": "KB/s",
  "dimensions": {
    "source": "AdGuard DNS Filter"
  }
}

WebSocket API

Endpoint

GET /ws/compile

WebSocket provides bidirectional communication for real-time compilation with cancellation support.

Features

  • ✅ Up to 3 concurrent compilations per connection
  • ✅ Real-time progress streaming with all event types
  • ✅ Cancellation support for running compilations
  • ✅ Automatic heartbeat (30s interval)
  • ✅ Connection timeout (5 minutes idle)
  • ✅ Session-based compilation tracking

Client → Server Messages

Compile Request

{
  "type": "compile",
  "sessionId": "my-session-1",
  "configuration": {
    "name": "My Filter List",
    "sources": [
      {
        "source": "https://example.com/filters.txt",
        "transformations": ["RemoveComments", "Validate"]
      }
    ],
    "transformations": ["Deduplicate"]
  },
  "benchmark": true
}

Cancel Request

{
  "type": "cancel",
  "sessionId": "my-session-1"
}

Ping (Heartbeat)

{
  "type": "ping"
}

Server → Client Messages

Welcome Message

{
  "type": "welcome",
  "version": "2.0.0",
  "connectionId": "ws-1737016800-abc123",
  "capabilities": {
    "maxConcurrentCompilations": 3,
    "supportsPauseResume": false,
    "supportsStreaming": true
  }
}

Compilation Started

{
  "type": "compile:started",
  "sessionId": "my-session-1",
  "configurationName": "My Filter List"
}

Event Message

All SSE-style events are wrapped in an event message:

{
  "type": "event",
  "sessionId": "my-session-1",
  "eventType": "diagnostic|cache|network|metric|source:start|...",
  "data": { /* event-specific data */ }
}

Compilation Complete

{
  "type": "compile:complete",
  "sessionId": "my-session-1",
  "rules": ["||ads.example.com^", "||tracking.example.com^"],
  "ruleCount": 2,
  "metrics": {
    "totalDurationMs": 1234,
    "sourceCount": 1,
    "ruleCount": 2
  },
  "compiledAt": "2026-01-14T05:00:00Z"
}

Error Messages

{
  "type": "compile:error",
  "sessionId": "my-session-1",
  "error": "Failed to fetch source",
  "details": {
    "stack": "..."
  }
}
{
  "type": "error",
  "error": "Maximum concurrent compilations reached",
  "code": "TOO_MANY_COMPILATIONS",
  "sessionId": "my-session-1"
}

JavaScript Client Examples

SSE Client

const eventSource = new EventSource('/compile/stream', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    configuration: {
      name: 'My List',
      sources: [{ source: 'https://example.com/filters.txt' }]
    }
  })
});

// Listen to all event types
['log', 'source:start', 'diagnostic', 'cache', 'network', 'metric', 'result', 'done'].forEach(event => {
  eventSource.addEventListener(event, (e) => {
    const data = JSON.parse(e.data);
    console.log(`[${event}]`, data);
  });
});

eventSource.addEventListener('error', (e) => {
  console.error('SSE Error:', e);
});

WebSocket Client

const ws = new WebSocket('ws://localhost:8787/ws/compile');

ws.onopen = () => {
  // Start compilation
  ws.send(JSON.stringify({
    type: 'compile',
    sessionId: 'session-' + Date.now(),
    configuration: {
      name: 'My Filter List',
      sources: [
        { source: 'https://example.com/filters.txt' }
      ],
      transformations: ['Deduplicate']
    },
    benchmark: true
  }));
};

ws.onmessage = (event) => {
  const message = JSON.parse(event.data);
  
  switch (message.type) {
    case 'welcome':
      console.log('Connected:', message.connectionId);
      break;
      
    case 'compile:started':
      console.log('Compilation started:', message.sessionId);
      break;
      
    case 'event':
      // Handle all event types
      console.log(`[${message.eventType}]`, message.data);
      if (message.eventType === 'diagnostic') {
        console.log('Diagnostic:', message.data.message);
      } else if (message.eventType === 'cache') {
        console.log('Cache operation:', message.data.operation);
      } else if (message.eventType === 'network') {
        console.log('Network request:', message.data.url, message.data.durationMs + 'ms');
      } else if (message.eventType === 'metric') {
        console.log('Metric:', message.data.metric, message.data.value, message.data.unit);
      }
      break;
      
    case 'compile:complete':
      console.log('Complete:', message.ruleCount, 'rules');
      console.log('Metrics:', message.metrics);
      break;
      
    case 'compile:error':
      console.error('Error:', message.error);
      break;
  }
};

// Cancel compilation after 5 seconds
setTimeout(() => {
  ws.send(JSON.stringify({
    type: 'cancel',
    sessionId: 'session-123'
  }));
}, 5000);

// Send heartbeat every 30 seconds
setInterval(() => {
  if (ws.readyState === WebSocket.OPEN) {
    ws.send(JSON.stringify({ type: 'ping' }));
  }
}, 30000);

Visual Testing

An interactive WebSocket test page is available:

http://localhost:8787/websocket-test.html

Features:

  • 🔗 Connection management
  • ⚙️ Compile request builder with quick configs
  • 📋 Real-time event log with color coding
  • 📊 Live statistics (events, sessions, rules)
  • 💻 Example code snippets

Event Categories

Diagnostic Events

{
  eventId: string;
  timestamp: string;
  category: 'compilation' | 'download' | 'transformation' | 'cache' | 'validation' | 'network' | 'performance' | 'error';
  severity: 'trace' | 'debug' | 'info' | 'warn' | 'error';
  message: string;
  correlationId?: string;
  metadata?: Record<string, unknown>;
}

Cache Events

{
  operation: 'hit' | 'miss' | 'write' | 'evict';
  key: string; // hashed for privacy
  size?: number; // bytes
}

Network Events

{
  method: string;
  url: string; // sanitized
  statusCode?: number;
  durationMs?: number;
  responseSize?: number; // bytes
}

Performance Metrics

{
  metric: string; // e.g., 'download_speed', 'parse_time'
  value: number;
  unit: string; // e.g., 'KB/s', 'ms', 'count'
  dimensions?: Record<string, string>; // for grouping
}

OpenAPI Specification

A comprehensive OpenAPI 3.0 specification is available at:

docs/api/openapi.yaml

This includes:

  • All REST endpoints
  • Complete request/response schemas
  • SSE event schemas
  • WebSocket protocol documentation
  • Security schemes
  • Example requests

Best Practices

SSE

  • ✅ Use for one-way streaming from server to client
  • ✅ Automatic reconnection built into browser EventSource
  • ✅ Simpler protocol, easier to debug
  • ❌ Cannot cancel running compilations
  • ❌ Limited to single compilation per connection

WebSocket

  • ✅ Use for bidirectional communication
  • ✅ Cancel running compilations
  • ✅ Multiple concurrent compilations per connection
  • ✅ Lower latency than SSE
  • ❌ More complex protocol
  • ❌ Requires manual reconnection logic

Performance

  • Monitor metric events for download speeds and processing times
  • Watch cache events to optimize cache hit rates
  • Track network events to identify slow sources
  • Use diagnostic events for debugging issues

Error Handling

SSE Errors

eventSource.addEventListener('error', (e) => {
  console.error('Connection lost, attempting to reconnect...');
  // EventSource automatically reconnects
});

WebSocket Errors

ws.onerror = (error) => {
  console.error('WebSocket error:', error);
};

ws.onclose = (event) => {
  if (!event.wasClean) {
    // Implement exponential backoff reconnection
    setTimeout(() => {
      connect(); // Your connection function
    }, 1000 * Math.pow(2, retryCount));
  }
};

Rate Limits

Both endpoints are subject to rate limiting:

  • 10 requests per minute per IP
  • Response: 429 Too Many Requests
  • Header: Retry-After: 60

WebSocket connections:

  • 3 concurrent compilations max per connection
  • 5 minute idle timeout
  • Heartbeat required every 30 seconds

See Also

Zod Validation Integration

This document describes the Zod schema validation system integrated into the adblock-compiler project.

Overview

The adblock-compiler uses Zod for runtime validation of configuration objects, API requests, and internal data structures. Zod provides:

  • Type-safe validation: Runtime validation with automatic TypeScript type inference
  • Composable schemas: Build complex schemas from simple building blocks
  • Detailed error messages: User-friendly validation error reporting
  • Zero dependencies: Lightweight and fast validation

Available Schemas

Configuration Schemas

SourceSchema

Validates individual source configurations in a filter list compilation.

import { SourceSchema } from '@jk-com/adblock-compiler';

const source = {
    source: 'https://example.com/filters.txt',
    name: 'Example Filters',
    type: 'adblock',
    exclusions: ['*ads*'],
    transformations: ['RemoveComments', 'Deduplicate'],
};

const result = SourceSchema.safeParse(source);
if (result.success) {
    console.log('Valid source:', result.data);
} else {
    console.error('Validation errors:', result.error);
}

Schema Definition:

  • source (string, required): URL (e.g. https://example.com/list.txt) or file path (/absolute/path or ./relative/path) to the filter list source. Plain strings that are neither a valid URL nor a recognized path are rejected.
  • name (string, optional): Human-readable name for the source
  • type (enum, optional): Source type - 'adblock' or 'hosts'
  • exclusions (string[], optional): List of rules or wildcards to exclude
  • exclusions_sources (string[], optional): List of files containing exclusions
  • inclusions (string[], optional): List of wildcards to include
  • inclusions_sources (string[], optional): List of files containing inclusions
  • transformations (TransformationType[], optional): List of transformations to apply

Normalization (.transform()):

SourceSchema automatically normalizes the parsed data:

  • source: leading and trailing whitespace is trimmed (whitespace-only values are rejected during validation)
  • name: leading and trailing whitespace is trimmed (if provided)

Transformation Ordering Refinement:

SourceSchema validates that if Compress is included in transformations, Deduplicate must also be present and must appear before Compress. This enforces correct ordering to prevent data loss.

// Valid: Deduplicate before Compress
{ transformations: ['Deduplicate', 'Compress'] }

// Invalid: Compress without Deduplicate
{ transformations: ['Compress'] }
// Error: "Deduplicate transformation is recommended before Compress. Add Deduplicate before Compress in transformations."

// Invalid: Compress before Deduplicate (wrong ordering)
{ transformations: ['Compress', 'Deduplicate'] }
// Error: "Deduplicate transformation is recommended before Compress. Add Deduplicate before Compress in transformations."

ConfigurationSchema

Validates the main compilation configuration object.

import { ConfigurationSchema } from '@jk-com/adblock-compiler';

const config = {
    name: 'My Custom Filter List',
    description: 'Blocks ads and trackers',
    homepage: 'https://example.com',
    license: 'GPL-3.0',
    version: '1.0.0',
    sources: [
        {
            source: 'https://example.com/filters.txt',
            name: 'Example Filters',
        },
    ],
    transformations: ['RemoveComments', 'Deduplicate', 'Compress'],
};

const result = ConfigurationSchema.safeParse(config);
if (result.success) {
    console.log('Valid configuration');
} else {
    console.error('Validation failed:', result.error.format());
}

Schema Definition:

  • name (string, required): Filter list name
  • description (string, optional): Filter list description
  • homepage (string, optional): Filter list homepage URL — validated as a URL (must start with http:// or https://)
  • license (string, optional): License identifier (e.g., 'GPL-3.0', 'MIT')
  • version (string, optional): Version string — must follow semver format (e.g. 1.0.0 or 1.0)
  • sources (ISource[], required): Array of source configurations (must not be empty)
  • Plus all fields from SourceSchema (exclusions, inclusions, transformations)

Transformation Ordering Refinement:

Same as SourceSchema — if Compress is in transformations, Deduplicate must also be present and must appear before Compress.

Worker Request Schemas

CompileRequestSchema

Validates compilation requests to the worker API.

import { CompileRequestSchema } from '@jk-com/adblock-compiler';

const request = {
    configuration: {
        name: 'My Filter List',
        sources: [{ source: 'https://example.com/filters.txt' }],
    },
    preFetchedContent: {
        'https://example.com/filters.txt': '||ads.example.com^\n||tracker.com^',
    },
    benchmark: true,
    priority: 'high',
    turnstileToken: 'token-xyz',
};

const result = CompileRequestSchema.safeParse(request);

Schema Definition:

  • configuration (IConfiguration, required): Configuration object (validated by ConfigurationSchema)
  • preFetchedContent (Record<string, string>, optional): Pre-fetched content map (source identifier → content). Keys may be URLs or arbitrary source identifiers.
  • benchmark (boolean, optional): Whether to collect benchmark metrics
  • priority (enum, optional): Request priority - 'standard' or 'high'
  • turnstileToken (string, optional): Cloudflare Turnstile verification token

BatchRequestSchema

Base schema for batch compilation requests.

import { BatchRequestSchema } from '@jk-com/adblock-compiler';

const batchRequest = {
    requests: [
        {
            id: 'request-1',
            configuration: { name: 'List 1', sources: [{ source: 'https://example.com/list1.txt' }] },
        },
        {
            id: 'request-2',
            configuration: { name: 'List 2', sources: [{ source: 'https://example.com/list2.txt' }] },
        },
    ],
    priority: 'standard',
};

const result = BatchRequestSchema.safeParse(batchRequest);

Schema Definition:

  • requests (array, required): Array of batch request items (must not be empty)
    • Each item contains:
      • id (string, required): Unique identifier for the request
      • configuration (IConfiguration, required): Configuration object
      • preFetchedContent (Record<string, string>, optional): Pre-fetched content
      • benchmark (boolean, optional): Whether to benchmark this request
  • priority (enum, optional): Batch priority - 'standard' or 'high'

Custom Refinement:

  • Validates that all request IDs are unique
  • Error message: "Duplicate request IDs are not allowed"

BatchRequestSyncSchema

Validates synchronous batch requests (limited to 10 items).

import { BatchRequestSyncSchema } from '@jk-com/adblock-compiler';

// Valid: 10 or fewer requests
const syncBatch = {
    requests: Array(10).fill(null).map((_, i) => ({
        id: `req-${i}`,
        configuration: { name: `List ${i}`, sources: [{ source: `https://example.com/list${i}.txt` }] },
    })),
};

const result = BatchRequestSyncSchema.safeParse(syncBatch);
// result.success === true

Limit: Maximum 10 requests Error Message: "Batch request limited to 10 requests maximum"

BatchRequestAsyncSchema

Validates asynchronous batch requests (limited to 100 items).

import { BatchRequestAsyncSchema } from '@jk-com/adblock-compiler';

// Valid: 100 or fewer requests
const asyncBatch = {
    requests: Array(50).fill(null).map((_, i) => ({
        id: `req-${i}`,
        configuration: { name: `List ${i}`, sources: [{ source: `https://example.com/list${i}.txt` }] },
    })),
};

const result = BatchRequestAsyncSchema.safeParse(asyncBatch);
// result.success === true

Limit: Maximum 100 requests Error Message: "Batch request limited to 100 requests maximum"

PrioritySchema

Validates the priority level for compilation requests. This schema is exported from @jk-com/adblock-compiler and re-used in worker/schemas.ts to avoid duplication.

import { PrioritySchema } from '@jk-com/adblock-compiler';

PrioritySchema.safeParse('standard'); // { success: true, data: 'standard' }
PrioritySchema.safeParse('high');     // { success: true, data: 'high' }
PrioritySchema.safeParse('low');      // { success: false }

Enum values: 'standard' | 'high'

The exported Priority type is inferred directly from this schema:

import type { Priority } from '@jk-com/adblock-compiler';
// type Priority = 'standard' | 'high'

Compilation Output Schemas

CompilationResultSchema

Validates the output of a compilation operation.

import { CompilationResultSchema } from '@jk-com/adblock-compiler';

const result = CompilationResultSchema.safeParse({
    rules: ['||ads.example.com^', '||tracker.com^'],
    ruleCount: 2,
});

Schema Definition:

  • rules (string[], required): Array of compiled filter rules
  • ruleCount (number, required): Non-negative integer count of rules

BenchmarkMetricsSchema

Validates compilation performance metrics returned when benchmark: true. Matches the CompilationMetrics interface from the compiler.

import { BenchmarkMetricsSchema } from '@jk-com/adblock-compiler';

Schema Definition:

  • totalDurationMs (number, required): Total compilation duration in milliseconds (non-negative)
  • stages (array, required): Per-stage benchmark results, each containing:
    • name (string, required): Stage name (e.g., 'fetch', 'transform')
    • durationMs (number, required): Stage duration in milliseconds (non-negative)
    • itemCount (number, optional): Number of items processed in this stage
    • itemsPerSecond (number, optional): Throughput: items processed per second
  • sourceCount (number, required): Number of sources processed (non-negative integer)
  • ruleCount (number, required): Total input rule count before transformations (non-negative integer)
  • outputRuleCount (number, required): Final output rule count after all transformations (non-negative integer)

WorkerCompilationResultSchema

Extends CompilationResultSchema with optional compilation metrics for worker responses. Matches the actual HTTP response shape returned by the Worker /compile endpoint.

import { WorkerCompilationResultSchema } from '@jk-com/adblock-compiler';

const result = WorkerCompilationResultSchema.safeParse({
    rules: ['||ads.example.com^'],
    ruleCount: 1,
    metrics: {
        totalDurationMs: 250,
        stages: [{ name: 'fetch', durationMs: 100 }, { name: 'transform', durationMs: 50 }],
        sourceCount: 1,
        ruleCount: 5,
        outputRuleCount: 1,
    },
});

Schema Definition:

  • All fields from CompilationResultSchema
  • metrics (BenchmarkMetrics, optional): Compilation performance metrics (present when benchmark: true)

CLI Schemas

CliArgumentsSchema

Validates parsed CLI arguments. Integrates with ArgumentParser.validate().

import { CliArgumentsSchema } from '@jk-com/adblock-compiler';

const args = CliArgumentsSchema.safeParse({
    config: 'myconfig.json',
    output: 'output.txt',
    verbose: true,
    noDeduplicate: true,
    exclude: ['*.cdn.example.com'],
    timeout: 10000,
});

General fields:

  • config (string, optional): Path to configuration file
  • input (string[], optional): Input source URLs or file paths
  • inputType (enum, optional): Input format — 'adblock' or 'hosts'
  • output (string, optional): Output file path
  • verbose (boolean, optional): Enable verbose logging
  • benchmark (boolean, optional): Enable benchmark reporting
  • useQueue (boolean, optional): Use async queue-based compilation
  • priority (enum, optional): Queue priority — 'standard' or 'high'
  • help (boolean, optional): Show help message
  • version (boolean, optional): Show version information

Output fields:

  • stdout (boolean, optional): Write output to stdout instead of a file
  • append (boolean, optional): Append to the output file instead of overwriting
  • format (string, optional): Output format
  • name (string, optional): Path to an existing file to compare output against
  • maxRules (number, optional, positive integer): Truncate output to at most this many rules

Transformation control fields:

  • noDeduplicate (boolean, optional): Skip the Deduplicate transformation
  • noValidate (boolean, optional): Skip the Validate transformation
  • noCompress (boolean, optional): Skip the Compress transformation
  • noComments (boolean, optional): Skip the RemoveComments transformation
  • invertAllow (boolean, optional): Apply the InvertAllow transformation
  • removeModifiers (boolean, optional): Apply the RemoveModifiers transformation
  • allowIp (boolean, optional): Replace Validate with ValidateAllowIp
  • convertToAscii (boolean, optional): Apply the ConvertToAscii transformation
  • transformation (TransformationType[], optional): Explicit transformation pipeline (overrides all other transformation flags). Values must be valid TransformationType enum members — invalid names are caught by Zod validation.

Filtering fields:

  • exclude (string[], optional): Exclusion rules or wildcard patterns
  • excludeFrom (string[], optional): Files containing exclusion rules
  • include (string[], optional): Inclusion rules or wildcard patterns
  • includeFrom (string[], optional): Files containing inclusion rules

Networking fields:

  • timeout (number, optional, positive integer): HTTP request timeout in milliseconds
  • retries (number, optional, non-negative integer): Number of HTTP retry attempts
  • userAgent (string, optional): Custom HTTP User-Agent header

Refinements:

  1. Either --input or --config must be specified (unless --help or --version)
  2. --output is required (unless --help, --version, or --stdout)
  3. Cannot specify both --config and --input simultaneously
  4. Cannot specify both --stdout and --output simultaneously

Environment Schema

EnvironmentSchema

Validates Cloudflare Worker environment bindings and runtime variables.

import { EnvironmentSchema } from '@jk-com/adblock-compiler';

const env = EnvironmentSchema.safeParse(workerEnv);

Schema Definition (all fields optional):

  • TURNSTILE_SECRET_KEY (string): Cloudflare Turnstile secret key
  • RATE_LIMIT_MAX_REQUESTS (number): Maximum requests per window (coerced from string)
  • RATE_LIMIT_WINDOW_MS (number): Rate limit window duration in milliseconds (coerced from string)
  • CACHE_TTL (number): Cache TTL in seconds (coerced from string)
  • LOG_LEVEL (enum): Log level — 'trace' | 'debug' | 'info' | 'warn' | 'error'

Additional worker bindings are allowed via .passthrough().

Filter Rule Schemas

AdblockRuleSchema

Validates the structure of a parsed adblock-syntax rule.

import { AdblockRuleSchema } from '@jk-com/adblock-compiler';

const rule = AdblockRuleSchema.safeParse({
    ruleText: '||ads.example.com^$important',
    pattern: 'ads.example.com',
    whitelist: false,
    options: [{ name: 'important', value: null }],
    hostname: 'ads.example.com',
});

Schema Definition:

  • ruleText (string, required, min 1): The raw rule text
  • pattern (string, required): The rule pattern
  • whitelist (boolean, required): Whether the rule is an allowlist rule
  • options (array | null, required): Array of { name: string, value: string | null } objects, or null
  • hostname (string | null, required): The target hostname, or null

EtcHostsRuleSchema

Validates the structure of a parsed /etc/hosts-syntax rule.

import { EtcHostsRuleSchema } from '@jk-com/adblock-compiler';

const rule = EtcHostsRuleSchema.safeParse({
    ruleText: '0.0.0.0 ads.example.com tracker.example.com',
    hostnames: ['ads.example.com', 'tracker.example.com'],
});

Schema Definition:

  • ruleText (string, required, min 1): The raw rule text
  • hostnames (string[], required, non-empty): Array of blocked hostnames

Using ConfigurationValidator

The ConfigurationValidator class provides a backward-compatible wrapper around Zod schemas.

import { ConfigurationValidator } from '@jk-com/adblock-compiler';

const validator = new ConfigurationValidator();

// Validate and get result
const result = validator.validate(configObject);
if (!result.valid) {
    console.error('Validation failed:', result.errorsText);
}

// Validate and throw on error
// Returns the Zod-parsed (and transformed) configuration object,
// e.g. with leading/trailing whitespace trimmed from string fields.
try {
    const validConfig = validator.validateAndGet(configObject);
    // Use validConfig safely — strings have been trimmed by SourceSchema's transform
} catch (error) {
    console.error('Invalid configuration:', error.message);
}

Type Inference

Zod schemas automatically infer TypeScript types:

import { z } from 'zod';
import { ConfigurationSchema } from '@jk-com/adblock-compiler';

// Infer the TypeScript type from the schema
type Configuration = z.infer<typeof ConfigurationSchema>;

// This type is equivalent to IConfiguration
const config: Configuration = {
    name: 'My List',
    sources: [{ source: 'https://example.com/list.txt' }],
};

Error Handling

Using safeParse()

The safeParse() method returns a result object that never throws:

const result = ConfigurationSchema.safeParse(data);

if (result.success) {
    // result.data contains the validated and typed data
    console.log('Valid configuration:', result.data);
} else {
    // result.error contains detailed validation errors
    console.error('Validation failed');
    
    // Get formatted errors
    const formatted = result.error.format();
    console.log('Formatted errors:', formatted);
    
    // Get flat list of errors
    const issues = result.error.issues;
    for (const issue of issues) {
        console.log(`Path: ${issue.path.join('.')}`);
        console.log(`Message: ${issue.message}`);
    }
}

Using parse()

The parse() method throws a ZodError if validation fails:

try {
    const validData = ConfigurationSchema.parse(data);
    // Use validData safely
} catch (error) {
    if (error instanceof z.ZodError) {
        console.error('Validation errors:', error.issues);
    }
}

Error Message Format

Validation errors include:

  • Path: Path to the invalid field (e.g., sources.0.source)
  • Message: Human-readable error description
  • Code: Error type code (e.g., invalid_type, too_small, custom)

Example error output:

sources.0.source: source is required and must be a non-empty string
sources: sources is required and must be a non-empty array
name: name is required and must be a non-empty string
transformations.2: Invalid enum value. Expected 'RemoveComments' | 'Compress' | ..., received 'InvalidTransformation'

Schema Composition

Zod schemas are composable, allowing you to build complex validation logic:

import { z } from 'zod';
import { ConfigurationSchema } from '@jk-com/adblock-compiler';

// Extend existing schema
const ExtendedConfigSchema = ConfigurationSchema.extend({
    customField: z.string().optional(),
    metadata: z.record(z.string(), z.unknown()).optional(),
});

// Partial schema (all fields optional)
const PartialConfigSchema = ConfigurationSchema.partial();

// Pick specific fields
const ConfigNameOnlySchema = ConfigurationSchema.pick({ name: true });

// Omit specific fields
const ConfigWithoutSourcesSchema = ConfigurationSchema.omit({ sources: true });

Best Practices

1. Always Use safeParse() for User Input

// Good: Handle validation errors gracefully
const result = ConfigurationSchema.safeParse(userInput);
if (!result.success) {
    return { error: result.error.format() };
}
return { data: result.data };

// Avoid: parse() throws and may crash your application
const data = ConfigurationSchema.parse(userInput); // Don't do this for user input

2. Validate Early

Validate data at system boundaries (API endpoints, file inputs):

// Validate immediately when receiving API request
app.post('/api/compile', async (req, res) => {
    const result = CompileRequestSchema.safeParse(req.body);
    
    if (!result.success) {
        return res.status(400).json({
            error: 'Invalid request',
            details: result.error.format(),
        });
    }
    
    // Now safely use result.data with full type safety
    const compiledOutput = await compiler.compile(result.data.configuration);
    res.json(compiledOutput);
});

3. Use Type Inference

Let Zod infer types instead of manually defining them:

import { z } from 'zod';
import { SourceSchema } from '@jk-com/adblock-compiler';

// Good: Type is automatically inferred and kept in sync
type Source = z.infer<typeof SourceSchema>;

// Avoid: Manual types can become out of sync with schema
interface Source {
    source: string;
    name?: string;
    // ... may forget to update when schema changes
}

4. Provide Custom Error Messages

Override default error messages for better UX:

const CustomSourceSchema = z.object({
    source: z.string()
        .min(1, 'Please provide a source URL')
        .url('Source must be a valid URL'),
    name: z.string()
        .min(1, 'Name cannot be empty')
        .max(100, 'Name must be 100 characters or less')
        .optional(),
});

5. Use .describe() for OpenAPI and Documentation

All exported schemas include .describe() annotations on their fields. These descriptions serve as machine-readable documentation and can be consumed by tools like zod-to-openapi to auto-generate OpenAPI specs:

import { SourceSchema } from '@jk-com/adblock-compiler';

// Access the description of the schema itself
// (available via the schema's internal _def.description or compatible OpenAPI tools)

// Example: integrate with zod-to-openapi
import { extendZodWithOpenApi } from '@asteasolutions/zod-to-openapi';
import { z } from 'zod';

extendZodWithOpenApi(z);

// Descriptions from .describe() annotations are automatically picked up
// when generating OpenAPI documentation from the schemas.

To add a description to your own derived schemas:

const CustomRequestSchema = z.object({
    source: z.string().url().describe('URL of the filter list to compile'),
    priority: PrioritySchema.optional().describe('Processing priority'),
});

6. Document Your Schemas

Add JSDoc comments to explain validation rules:

/**
 * Schema for custom filter configuration.
 * 
 * @example
 * ```typescript
 * const config = {
 *   source: 'https://example.com/list.txt',
 *   maxSize: 1000000, // 1MB max
 * };
 * 
 * const result = CustomSchema.safeParse(config);
 * ```
 */
export const CustomSchema = z.object({
    source: z.string().url(),
    maxSize: z.number().int().positive().max(10_000_000),
});

Integration Examples

Express/Hono API Validation

import { Hono } from 'hono';
import { CompileRequestSchema } from '@jk-com/adblock-compiler';

const app = new Hono();

app.post('/compile', async (c) => {
    const body = await c.req.json();
    const result = CompileRequestSchema.safeParse(body);
    
    if (!result.success) {
        return c.json({
            error: 'Validation failed',
            issues: result.error.issues,
        }, 400);
    }
    
    // Process validated request
    const compiled = await processCompilation(result.data);
    return c.json(compiled);
});

CLI Argument Validation

import { ConfigurationSchema } from '@jk-com/adblock-compiler';
import { readFileSync } from 'fs';

const configFile = process.argv[2];
const configJson = readFileSync(configFile, 'utf-8');
const configData = JSON.parse(configJson);

const result = ConfigurationSchema.safeParse(configData);
if (!result.success) {
    console.error('Invalid configuration file:');
    for (const issue of result.error.issues) {
        console.error(`  ${issue.path.join('.')}: ${issue.message}`);
    }
    process.exit(1);
}

console.log('Configuration is valid!');

File Upload Validation

import { SourceSchema } from '@jk-com/adblock-compiler';

async function validateUploadedSources(files: File[]) {
    const sources = [];
    
    for (const file of files) {
        const content = await file.text();
        const data = JSON.parse(content);
        
        const result = SourceSchema.safeParse(data);
        if (!result.success) {
            throw new Error(`Invalid source in ${file.name}: ${result.error.message}`);
        }
        
        sources.push(result.data);
    }
    
    return sources;
}

Advanced Usage

Custom Refinements

Add custom validation logic beyond basic type checking:

import { z } from 'zod';
import { ConfigurationSchema } from '@jk-com/adblock-compiler';

const StrictConfigSchema = ConfigurationSchema.refine(
    (config) => {
        // Ensure at least one source has a name
        return config.sources.some((s) => s.name);
    },
    {
        message: 'At least one source must have a name',
        path: ['sources'],
    },
);

Transform Data During Validation

Use .transform() to normalize or clean data:

const NormalizedSourceSchema = SourceSchema.transform((data) => ({
    ...data,
    source: data.source.trim(),
    name: data.name?.trim() || 'Unnamed Source',
}));

Union Types

Validate against multiple possible schemas:

const RequestSchema = z.union([
    CompileRequestSchema,
    z.object({ type: z.literal('batch'), batch: BatchRequestSchema }),
]);

Migration Guide

From Manual Validation to Zod

Before:

function validateConfig(config: unknown): IConfiguration {
    if (!config || typeof config !== 'object') {
        throw new Error('Configuration must be an object');
    }
    
    const cfg = config as any;
    
    if (!cfg.name || typeof cfg.name !== 'string') {
        throw new Error('name is required');
    }
    
    if (!Array.isArray(cfg.sources) || cfg.sources.length === 0) {
        throw new Error('sources is required and must be a non-empty array');
    }
    
    // ... many more checks
    
    return cfg as IConfiguration;
}

After:

import { ConfigurationSchema } from '@jk-com/adblock-compiler';

function validateConfig(config: unknown): IConfiguration {
    const result = ConfigurationSchema.safeParse(config);
    
    if (!result.success) {
        throw new Error(`Configuration validation failed:\n${result.error.message}`);
    }
    
    return result.data;
}

Performance Considerations

Zod validation is fast, but consider these optimizations for high-throughput scenarios:

  1. Reuse schema instances: Don't recreate schemas on every validation
  2. Use .parse() carefully: Only in trusted contexts where you want to throw on error
  3. Consider lazy validation: Use z.lazy() for recursive schemas
  4. Profile your validation: Use benchmarks to identify bottlenecks
// Good: Reuse schema
const schema = ConfigurationSchema;
for (const config of configs) {
    schema.safeParse(config);
}

// Avoid: Recreating schema each time
for (const config of configs) {
    z.object({ /* ... */ }).safeParse(config); // Don't do this
}

Testing Schemas

Always test your schemas with both valid and invalid data:

import { assertEquals } from '@std/assert';
import { ConfigurationSchema } from '@jk-com/adblock-compiler';

Deno.test('ConfigurationSchema validates correct data', () => {
    const validConfig = {
        name: 'Test List',
        sources: [{ source: 'https://example.com/list.txt' }],
    };
    
    const result = ConfigurationSchema.safeParse(validConfig);
    assertEquals(result.success, true);
});

Deno.test('ConfigurationSchema rejects missing name', () => {
    const invalidConfig = {
        sources: [{ source: 'https://example.com/list.txt' }],
    };
    
    const result = ConfigurationSchema.safeParse(invalidConfig);
    assertEquals(result.success, false);
    if (!result.success) {
        assertEquals(result.error.issues.some((i) => i.path.includes('name')), true);
    }
});

Resources

Cloudflare Worker Documentation

Documentation for Cloudflare-specific features, services, and integrations.

Contents

Cloudflare Services Integration

This document describes all Cloudflare services integrated into the adblock-compiler project, their current status, and configuration guidance.


Service Status Overview

ServiceStatusBindingPurpose
KV Namespaces✅ ActiveCOMPILATION_CACHE, RATE_LIMIT, METRICSCaching, rate limiting, metrics aggregation
R2 Storage✅ ActiveFILTER_STORAGEFilter list storage and artifact persistence
D1 Database✅ ActiveDBCompilation history, deployment records
Queues✅ ActiveADBLOCK_COMPILER_QUEUE, ADBLOCK_COMPILER_QUEUE_HIGH_PRIORITYAsync compilation, batch processing
Analytics Engine✅ ActiveANALYTICS_ENGINERequest metrics, cache analytics, workflow tracking
Workflows✅ ActiveCOMPILATION_WORKFLOW, BATCH_COMPILATION_WORKFLOW, CACHE_WARMING_WORKFLOW, HEALTH_MONITORING_WORKFLOWDurable async execution
Hyperdrive✅ ActiveHYPERDRIVEAccelerated PostgreSQL (PlanetScale) connectivity
Tail Worker✅ Activeadblock-compiler-tailLog collection, error forwarding
SSE Streaming✅ ActiveReal-time compilation progress via /compile/stream
WebSocket✅ ActiveReal-time bidirectional compile via /ws/compile
Observability✅ ActiveBuilt-in logs and traces via [observability]
Cron Triggers✅ ActiveCache warming (every 6h), health monitoring (every 1h)
Pipelines✅ ConfiguredMETRICS_PIPELINEMetrics/audit event ingestion → R2
Log Sink (HTTP)✅ ConfiguredLOG_SINK_URL (env var)Tail worker forwards to external log service
API Shield📋 DashboardOpenAPI schema validation at edge (see below)
Containers🔧 ConfiguredADBLOCK_COMPILERDurable Object container (production only)

Cloudflare Pipelines

Pipelines provide scalable, batched HTTP event ingestion — ideal for routing metrics and audit events to R2 or downstream analytics.

Setup

# Create the pipeline (routes to R2)
wrangler pipelines create adblock-compiler-metrics-pipeline \
  --r2-bucket adblock-compiler-r2-storage \
  --batch-max-mb 10 \
  --batch-timeout-secs 30

Usage

The PipelineService (src/services/PipelineService.ts) provides a type-safe wrapper:

import { PipelineService } from '../src/services/PipelineService.ts';

const pipeline = new PipelineService(env.METRICS_PIPELINE, logger);

await pipeline.send({
    type: 'compilation_success',
    requestId: 'req-123',
    durationMs: 250,
    ruleCount: 12000,
    sourceCount: 5,
});

Configuration

The binding is defined in wrangler.toml:

[[pipelines]]
binding = "METRICS_PIPELINE"
pipeline = "adblock-compiler-metrics-pipeline"

Log Sinks (Tail Worker)

The tail worker (worker/tail.ts) can forward structured logs to any HTTP log ingestion endpoint (Better Stack, Grafana Loki, Logtail, etc.).

Configuration

Set these secrets/environment variables:

wrangler secret put LOG_SINK_URL       # e.g. https://in.logs.betterstack.com
wrangler secret put LOG_SINK_TOKEN     # Bearer token for the log sink

Optional env var (defaults to warn):

wrangler secret put LOG_SINK_MIN_LEVEL  # debug | info | warn | error

Supported Log Sinks

ServiceLOG_SINK_URLAuth
Better Stackhttps://in.logs.betterstack.comBearer token
Logtailhttps://in.logtail.comBearer token
Grafana Lokihttps://<host>/loki/api/v1/pushBearer token
Custom HTTPAny HTTPS endpointBearer token (optional)

API Shield

Cloudflare API Shield enforces OpenAPI schema validation at the edge for all requests to /compile, /compile/stream, and /compile/batch. This is configured in the Cloudflare dashboard — no code changes are required.

Setup

  1. Go to Cloudflare Dashboard → Security → API Shield
  2. Click Add Schema and upload docs/api/cloudflare-schema.yaml
  3. Set Mitigation action to Block for schema violations
  4. Enable for endpoints:
    • POST /compile
    • POST /compile/stream
    • POST /compile/batch

Schema Location

The OpenAPI schema is at docs/api/cloudflare-schema.yaml (auto-generated by deno task schema:cloudflare).


Analytics Engine

The Analytics Engine tracks all key events through src/services/AnalyticsService.ts. Data is queryable via the Cloudflare Workers Analytics API.

Tracked Events

EventDescription
compilation_requestEvery incoming compile request
compilation_successSuccessful compilation with timing and rule count
compilation_errorFailed compilation with error type
cache_hit / cache_missKV cache effectiveness
rate_limit_exceededRate limit hits by IP
workflow_started / completed / failedWorkflow lifecycle
batch_compilationBatch compile job metrics
api_requestAll API endpoint calls

Querying

-- Average compilation time over last 24h
SELECT
  avg(double1) as avg_duration_ms,
  sum(double2) as total_rules
FROM adguard-compiler-analytics-engine
WHERE timestamp > NOW() - INTERVAL '1' DAY
  AND blob1 = 'compilation_success'

D1 Database

D1 stores compilation history and deployment records, enabling the admin dashboard to show historical data.

Schema

Migrations are in migrations/. Apply with:

wrangler d1 execute adblock-compiler-d1-database --file=migrations/0001_init.sql --remote
wrangler d1 execute adblock-compiler-d1-database --file=migrations/0002_deployment_history.sql --remote

Workflows

Four durable workflows handle crash-resistant async operations:

WorkflowTriggerPurpose
CompilationWorkflow/compile/asyncSingle async compilation with retry
BatchCompilationWorkflow/compile/batchPer-item recovery for batch jobs
CacheWarmingWorkflowCron (every 6h)Pre-populate KV cache
HealthMonitoringWorkflowCron (every 1h)Check source URL health

References

Admin Dashboard

The Adblock Compiler Admin Dashboard is the main landing page that provides a centralized control panel for managing, testing, and monitoring the filter list compilation service.

Overview

The dashboard is accessible at the root URL (/) and provides:

  • Real-time metrics - Monitor compilation requests, queue depth, cache performance, and response times
  • Navigation hub - Quick access to all tools and test pages
  • Notification system - Browser notifications for async compilation jobs
  • Queue visualization - Chart.js-powered queue depth tracking
  • Quick actions - Common administrative tasks

Features

📊 Real-time Metrics

The dashboard displays four key metrics that update automatically:

  1. Total Requests - Cumulative API requests processed
  2. Queue Depth - Current number of pending compilation jobs
  3. Cache Hit Rate - Percentage of requests served from cache
  4. Avg Response Time - Average compilation response time in milliseconds

Metrics refresh automatically every 30 seconds and can be manually refreshed using the "Refresh" button.

🚀 Main Tools

Quick navigation cards to primary tools:

  • Filter List Compiler (/compiler.html) - Interactive UI for compiling filter lists with real-time progress
  • API Test Suite (/test.html) - Test API endpoints with various configurations
  • E2E Integration Tests (/e2e-tests.html) - End-to-end testing of all compiler features

⚡ Real-time & Performance

Advanced features and demonstrations:

WebSocket Demo (/websocket-test.html)

WebSocket endpoint demonstration showing bidirectional real-time compilation.

Use WebSocket when:

  • You need full-duplex communication
  • Lower latency is critical
  • You want to send data both ways (client → server, server → client)
  • Building interactive applications requiring instant feedback

Benefits over other approaches:

  • Lower latency than Server-Sent Events (SSE)
  • True bidirectional communication
  • Better for real-time interactive applications
  • Connection stays open for multiple operations

Benchmarks

Access to performance benchmarks for:

  • String utilities performance
  • Wildcard matching speed
  • Rule parsing efficiency
  • Transformation throughput

Run benchmarks via CLI:

deno task bench                      # All benchmarks
deno task bench:utils                # String & utility benchmarks
deno task bench:transformations      # Transformation benchmarks

Endpoint Comparison

Understanding when to use each compilation endpoint:

EndpointTypeUse Case
POST /compileJSONSimple compilation with immediate JSON response
POST /compile/streamSSEServer-Sent Events for one-way progress updates
GET /ws/compileWebSocketBidirectional real-time with interactive feedback
POST /compile/asyncQueueBackground processing for long-running jobs

Choose:

  • JSON - Simple, fire-and-forget compilations
  • SSE - Progress tracking with unidirectional updates
  • WebSocket - Interactive applications needing bidirectional communication
  • Queue - Background jobs that don't need immediate results

🔔 Notification System

The dashboard includes a browser notification system for tracking async compilation jobs.

Features

  • Browser notifications - Native OS notifications when jobs complete
  • In-page toasts - Visual notifications within the dashboard
  • Job tracking - Automatic monitoring of queued compilation jobs
  • Persistent state - Notifications work across page refreshes

How to Enable

  1. Click the notification toggle in the dashboard
  2. Allow browser notifications when prompted
  3. Tracked async jobs will trigger notifications upon completion

Notification Types

  • Success (Green) - Job completed successfully
  • Error (Red) - Job failed with error
  • Warning (Yellow) - Important information
  • Info (Blue) - General updates

Notifications appear in two forms:

  1. Browser/OS notifications - Native system notifications (when enabled)
  2. In-page toasts - Slide-in notifications in the top-right corner

Tracking Async Jobs

When you submit an async compilation job (via /compile/async or /compile/batch/async), the dashboard:

  1. Stores the requestId in local storage
  2. Polls queue stats every 10 seconds
  3. Detects when the job completes
  4. Shows both browser and in-page notifications
  5. Displays completion time and configuration name

Jobs are automatically cleaned up 1 hour after creation.

📈 Queue Monitoring

Real-time visualization of queue depth over time using Chart.js:

  • Line chart showing queue depth history
  • Last 20 data points displayed
  • Auto-updates every 30 seconds
  • Responsive design

⚡ Quick Actions

One-click access to common tasks:

  • API Docs - View full API documentation
  • View Metrics - Raw metrics JSON endpoint
  • Queue Stats - Detailed queue statistics
  • Clear Cache - Cache management (admin only)

File Structure Changes

The admin dashboard is part of a reorganization of the public files:

Before:

public/
  index.html          # Compiler UI
  test.html
  e2e-tests.html
  websocket-test.html

After:

public/
  index.html          # Admin Dashboard (NEW - landing page)
  compiler.html       # Compiler UI (renamed from index.html)
  test.html
  e2e-tests.html
  websocket-test.html

Auto-refresh

The dashboard automatically refreshes data every 30 seconds:

  • Metrics (requests, cache, response time)
  • Queue statistics and depth
  • Queue depth chart updates
  • Async job monitoring (every 10 seconds)

Manual refresh is available via the "Refresh" button in the queue chart section.

API Endpoints Used

The dashboard makes calls to the following endpoints:

  • GET /metrics - Performance and request metrics
  • GET /queue/stats - Queue depth, history, and job status
  • GET /queue/history - Historical queue depth data

Browser Compatibility

The dashboard uses modern web features:

  • Chart.js 4.4.1 - For queue visualization
  • Notification API - For browser notifications (optional)
  • LocalStorage - For persistent settings and job tracking
  • Fetch API - For API calls
  • CSS Grid & Flexbox - For responsive layout

Supported browsers:

  • Chrome/Edge 90+
  • Firefox 88+
  • Safari 14+

Customization

Theme Colors

CSS custom properties (defined in :root):

--primary: #667eea;
--secondary: #764ba2;
--success: #10b981;
--danger: #ef4444;
--warning: #f59e0b;
--info: #3b82f6;

Refresh Intervals

To adjust auto-refresh timing, modify the JavaScript:

// Auto-refresh metrics (default: 30 seconds)
setInterval(refreshMetrics, 30000);

// Monitor async jobs (default: 10 seconds)
setInterval(async () => { /* ... */ }, 10000);

Security

  • Rate limiting - Applied to compilation endpoints
  • CORS - Configured for cross-origin access
  • Turnstile - Optional bot protection
  • No sensitive data - Dashboard displays public metrics only

Performance

  • Lazy loading - Charts initialized only when needed
  • Debounced updates - Prevents excessive re-renders
  • Efficient polling - Only fetches data when tracking jobs
  • LocalStorage cleanup - Removes old tracked jobs automatically

Accessibility

  • Semantic HTML structure
  • ARIA labels where appropriate
  • Keyboard navigation support
  • Responsive design for mobile devices
  • High contrast colors for readability

Future Enhancements

Potential additions to the dashboard:

  • Dark mode toggle
  • Customizable refresh intervals
  • Historical metrics graphs
  • Job scheduling interface
  • Real-time WebSocket connection status
  • Filter list library management
  • User authentication for admin features

Cloudflare Analytics Engine Integration

This document describes the Analytics Engine integration for tracking metrics and telemetry data in the adblock-compiler worker.

Overview

Cloudflare Analytics Engine provides high-cardinality, real-time analytics with SQL-like querying capabilities. The adblock-compiler uses Analytics Engine to track:

  • API request metrics
  • Compilation success/failure rates
  • Cache hit/miss ratios
  • Rate limiting events
  • Workflow execution metrics
  • Source fetch performance

Configuration

wrangler.toml Setup

The Analytics Engine binding is already configured in wrangler.toml:

[[analytics_engine_datasets]]
binding = "ANALYTICS_ENGINE"
dataset = "adguard-compiler-analytics-engine"

Environment Binding

The Env interface in worker/worker.ts includes the optional Analytics Engine binding:

interface Env {
    // ... other bindings
    ANALYTICS_ENGINE?: AnalyticsEngineDataset;
}

The binding is optional, allowing the worker to function without Analytics Engine configured (e.g., in development).

AnalyticsService

The AnalyticsService class (src/services/AnalyticsService.ts) provides a typed interface for tracking events.

Event Types

Event TypeDescription
compilation_requestA compilation request was received
compilation_successCompilation completed successfully
compilation_errorCompilation failed with an error
cache_hitResult served from cache
cache_missCache miss, compilation required
rate_limit_exceededClient exceeded rate limit
source_fetchExternal source fetch completed
workflow_startedWorkflow execution started
workflow_completedWorkflow completed successfully
workflow_failedWorkflow failed with an error
api_requestGeneric API request tracking

Data Model

Analytics Engine data points consist of:

  • Index (1): Event type for efficient filtering
  • Doubles (up to 20): Numeric metrics
  • Blobs (up to 20): String metadata

Usage Example

import { AnalyticsService } from '../src/services/AnalyticsService.ts';

// Create service instance
const analytics = new AnalyticsService(env.ANALYTICS_ENGINE);

// Track a compilation request
analytics.trackCompilationRequest({
    requestId: 'req-123',
    configName: 'EasyList',
    sourceCount: 3,
});

// Track success with metrics
analytics.trackCompilationSuccess({
    requestId: 'req-123',
    configName: 'EasyList',
    sourceCount: 3,
    ruleCount: 50000,
    durationMs: 1234,
    cacheKey: 'cache:abc123',
});

// Track errors
analytics.trackCompilationError({
    requestId: 'req-123',
    configName: 'EasyList',
    sourceCount: 3,
    durationMs: 500,
    error: 'Source fetch failed',
});

Utility Methods

// Hash IP addresses for privacy
const ipHash = AnalyticsService.hashIp('192.168.1.1');

// Categorize user agents
const category = AnalyticsService.categorizeUserAgent(userAgent);
// Returns: 'adguard', 'ublock', 'browser', 'curl', 'bot', 'library', 'unknown'

Tracked Locations

Analytics tracking is integrated into:

Worker Endpoints (worker/worker.ts)

  • Rate limiting: Tracks when clients exceed rate limits
  • Cache hits/misses: Tracks cache performance on /compile/json
  • Compilation requests: Tracks all compilation attempts
  • Compilation results: Tracks success/failure with metrics

Workflows

All workflows track execution metrics:

WorkflowEvents Tracked
CompilationWorkflowstarted, completed, failed
BatchCompilationWorkflowstarted, completed, failed
CacheWarmingWorkflowstarted, completed, failed
HealthMonitoringWorkflowstarted, completed, failed

Querying Analytics Data

Use the Cloudflare dashboard or GraphQL API to query analytics:

Dashboard

  1. Go to Cloudflare Dashboard > Analytics & Logs > Analytics Engine
  2. Select the adguard-compiler-analytics-engine dataset
  3. Use SQL queries to analyze data

Example Queries

-- Compilation success rate over last 24 hours
SELECT
    blob1 as event_type,
    COUNT(*) as count
FROM adguard-compiler-analytics-engine
WHERE timestamp > NOW() - INTERVAL '24' HOUR
    AND blob1 IN ('compilation_success', 'compilation_error')
GROUP BY blob1

-- Average compilation duration by config
SELECT
    blob2 as config_name,
    AVG(double1) as avg_duration_ms,
    COUNT(*) as total_compilations
FROM adguard-compiler-analytics-engine
WHERE timestamp > NOW() - INTERVAL '7' DAY
    AND blob1 = 'compilation_success'
GROUP BY blob2
ORDER BY total_compilations DESC

-- Cache hit ratio
SELECT
    SUM(CASE WHEN blob1 = 'cache_hit' THEN 1 ELSE 0 END) as hits,
    SUM(CASE WHEN blob1 = 'cache_miss' THEN 1 ELSE 0 END) as misses,
    SUM(CASE WHEN blob1 = 'cache_hit' THEN 1 ELSE 0 END) * 100.0 /
        COUNT(*) as hit_rate_percent
FROM adguard-compiler-analytics-engine
WHERE timestamp > NOW() - INTERVAL '24' HOUR
    AND blob1 IN ('cache_hit', 'cache_miss')

-- Rate limit events by IP hash
SELECT
    blob3 as ip_hash,
    COUNT(*) as limit_events
FROM adguard-compiler-analytics-engine
WHERE timestamp > NOW() - INTERVAL '1' HOUR
    AND blob1 = 'rate_limit_exceeded'
GROUP BY blob3
ORDER BY limit_events DESC
LIMIT 10

Graceful Degradation

The AnalyticsService gracefully handles missing Analytics Engine:

constructor(dataset?: AnalyticsEngineDataset) {
    this.dataset = dataset;
    this.enabled = !!dataset;
}

private writeDataPoint(event: AnalyticsEventData): void {
    if (!this.enabled || !this.dataset) {
        return; // Silently skip when not configured
    }
    // ... write data point
}

This ensures:

  • Local development works without Analytics Engine
  • No errors if binding is missing
  • Easy toggle for analytics collection

Data Retention

Analytics Engine data is retained according to your Cloudflare plan:

  • Free: 31 days
  • Pro: 90 days
  • Business: 1 year
  • Enterprise: Custom

Privacy Considerations

The implementation includes privacy-conscious practices:

  1. IP Hashing: Client IPs are hashed before storage
  2. No PII: No personal identifiable information is stored
  3. User Agent Categorization: User agents are categorized rather than stored raw
  4. Request ID Tracking: Uses generated request IDs rather than user identifiers

Extending Analytics

To add new event tracking:

  1. Add a new event type to AnalyticsEventType:
export type AnalyticsEventType =
    | 'compilation_request'
    // ... existing types
    | 'your_new_event';
  1. Create a data interface if needed:
export interface YourEventData {
    requestId: string;
    // ... fields
}
  1. Add a tracking method to AnalyticsService:
public trackYourEvent(data: YourEventData): void {
    this.writeDataPoint({
        eventType: 'your_new_event',
        timestamp: Date.now(),
        doubles: [data.someNumber],
        blobs: [data.requestId, data.someString],
    });
}
  1. Call the tracking method where appropriate in the codebase.

Troubleshooting

Analytics Not Recording

  1. Verify the binding exists in wrangler.toml
  2. Check the dataset name matches
  3. Ensure ANALYTICS_ENGINE is in your Env interface
  4. Check Cloudflare dashboard for the dataset

Query Returns No Results

  1. Verify the time range includes recent data
  2. Check event type names match exactly
  3. Ensure data is being written (check worker logs)

High Cardinality Warnings

If you see cardinality warnings:

  1. Avoid using raw IPs or unique identifiers in indexes
  2. Use categorical values in blob fields
  3. Consider aggregating data before writing

Cloudflare D1 Integration Guide

Complete guide for using Prisma with Cloudflare D1 in the adblock-compiler project.

Overview

Cloudflare D1 is a serverless SQLite database that runs at the edge, offering:

  • Global distribution - Data replicated across Cloudflare's edge network
  • SQLite compatibility - Familiar SQL syntax and tooling
  • Serverless - No infrastructure management
  • Low latency - Edge-first architecture
  • Cost effective - Pay-per-use pricing model

Prerequisites

  • Cloudflare account with Workers enabled
  • Wrangler CLI installed (npm install -g wrangler)
  • Node.js 18+ or Deno

Quick Start

1. Install Dependencies

npm install @prisma/client @prisma/adapter-d1
npm install -D prisma wrangler

2. Create D1 Database

# Login to Cloudflare
wrangler login

# Create a new D1 database
wrangler d1 create adblock-storage

# Note the database_id from the output

3. Configure wrangler.toml

Create or update wrangler.toml in your project root:

name = "adblock-compiler"
main = "src/worker.ts"
compatibility_date = "2024-01-01"

[[d1_databases]]
binding = "DB"
database_name = "adblock-storage"
database_id = "YOUR_DATABASE_ID_HERE"

4. Create D1 Prisma Schema

Create prisma/schema.d1.prisma:

generator client {
  provider        = "prisma-client-js"
  previewFeatures = ["driverAdapters"]
}

datasource db {
  provider = "sqlite"
  url      = "file:./dev.db"
}

model StorageEntry {
  id        String   @id @default(cuid())
  key       String   @unique
  data      String
  createdAt DateTime @default(now())
  updatedAt DateTime @updatedAt
  expiresAt DateTime?
  tags      String?

  @@index([key])
  @@index([expiresAt])
  @@map("storage_entries")
}

model FilterCache {
  id        String   @id @default(cuid())
  source    String   @unique
  content   String
  hash      String
  etag      String?
  createdAt DateTime @default(now())
  updatedAt DateTime @updatedAt
  expiresAt DateTime?

  @@index([source])
  @@index([expiresAt])
  @@map("filter_cache")
}

model CompilationMetadata {
  id          String   @id @default(cuid())
  configName  String
  timestamp   DateTime @default(now())
  sourceCount Int
  ruleCount   Int
  duration    Int
  outputPath  String?

  @@index([configName])
  @@index([timestamp])
  @@map("compilation_metadata")
}

model SourceSnapshot {
  id          String   @id @default(cuid())
  source      String
  timestamp   DateTime @default(now())
  contentHash String
  ruleCount   Int
  ruleSample  String?
  etag        String?
  isCurrent   Int      @default(1)

  @@unique([source, isCurrent])
  @@index([source])
  @@index([timestamp])
  @@map("source_snapshots")
}

model SourceHealth {
  id                  String   @id @default(cuid())
  source              String   @unique
  status              String
  totalAttempts       Int      @default(0)
  successfulAttempts  Int      @default(0)
  failedAttempts      Int      @default(0)
  consecutiveFailures Int      @default(0)
  averageDuration     Float    @default(0)
  averageRuleCount    Float    @default(0)
  lastAttemptAt       DateTime?
  lastSuccessAt       DateTime?
  lastFailureAt       DateTime?
  recentAttempts      String?
  updatedAt           DateTime @updatedAt

  @@index([source])
  @@index([status])
  @@map("source_health")
}

model SourceAttempt {
  id        String   @id @default(cuid())
  source    String
  timestamp DateTime @default(now())
  success   Int      @default(0)
  duration  Int
  error     String?
  ruleCount Int?
  etag      String?

  @@index([source])
  @@index([timestamp])
  @@map("source_attempts")
}

5. Generate Prisma Client

# Generate with D1 schema
npx prisma generate --schema=prisma/schema.d1.prisma

6. Create Database Migrations

# Generate SQL migration
npx prisma migrate diff \
  --from-empty \
  --to-schema-datamodel prisma/schema.d1.prisma \
  --script > migrations/0001_init.sql

# Apply to local D1
wrangler d1 execute adblock-storage --local --file=migrations/0001_init.sql

# Apply to remote D1
wrangler d1 execute adblock-storage --file=migrations/0001_init.sql

7. Create D1 Storage Adapter

See src/storage/D1StorageAdapter.ts for the complete implementation.

Usage in Cloudflare Workers

Worker Entry Point

// src/worker.ts
import { PrismaClient } from '@prisma/client';
import { PrismaD1 } from '@prisma/adapter-d1';
import { D1StorageAdapter } from './storage/D1StorageAdapter';

export interface Env {
    DB: D1Database;
}

export default {
    async fetch(request: Request, env: Env): Promise<Response> {
        // Create Prisma client with D1 adapter
        const adapter = new PrismaD1(env.DB);
        const prisma = new PrismaClient({ adapter });

        // Create storage adapter
        const storage = new D1StorageAdapter(prisma);

        // Example: Cache a filter list
        await storage.cacheFilterList(
            'https://example.com/filters.txt',
            ['||ad.example.com^'],
            'hash123',
        );

        // Example: Get cached filter
        const cached = await storage.getCachedFilterList('https://example.com/filters.txt');

        return new Response(
            JSON.stringify({
                cached: cached !== null,
                ruleCount: cached?.content.length || 0,
            }),
            {
                headers: { 'Content-Type': 'application/json' },
            },
        );
    },
};

Type Definitions

// src/types/env.d.ts
interface Env {
    DB: D1Database;
    CACHE_TTL?: string;
    DEBUG?: string;
}

D1 Storage Adapter API

The D1 adapter implements the same IStorageAdapter interface:

interface ID1StorageAdapter {
    // Core operations
    set<T>(key: string[], value: T, ttlMs?: number): Promise<boolean>;
    get<T>(key: string[]): Promise<StorageEntry<T> | null>;
    delete(key: string[]): Promise<boolean>;
    list<T>(options?: QueryOptions): Promise<Array<{ key: string[]; value: StorageEntry<T> }>>;

    // Filter caching
    cacheFilterList(source: string, content: string[], hash: string, etag?: string, ttlMs?: number): Promise<boolean>;
    getCachedFilterList(source: string): Promise<CacheEntry | null>;

    // Metadata
    storeCompilationMetadata(metadata: CompilationMetadata): Promise<boolean>;
    getCompilationHistory(configName: string, limit?: number): Promise<CompilationMetadata[]>;

    // Maintenance
    clearExpired(): Promise<number>;
    clearCache(): Promise<number>;
    getStats(): Promise<StorageStats>;
}

Local Development

Using Wrangler Dev

# Start local development server
wrangler dev

# With local D1 database
wrangler dev --local --persist

Local D1 Testing

# Execute SQL on local D1
wrangler d1 execute adblock-storage --local --command="SELECT * FROM storage_entries"

# Export local database
wrangler d1 export adblock-storage --local --output=backup.sql

Migration from Prisma/SQLite

Export Data from SQLite

// scripts/export-from-sqlite.ts
import { PrismaStorageAdapter } from './src/storage/PrismaStorageAdapter.ts';

const storage = new PrismaStorageAdapter(logger, { type: 'prisma' });
await storage.open();

const entries = await storage.list({ prefix: [] });
const exportData = entries.map((e) => ({
    key: e.key.join('/'),
    data: JSON.stringify(e.value.data),
    createdAt: e.value.createdAt,
    expiresAt: e.value.expiresAt,
}));

await Deno.writeTextFile('export.json', JSON.stringify(exportData, null, 2));

Import to D1

// scripts/import-to-d1.ts
const data = JSON.parse(await Deno.readTextFile('export.json'));

for (const entry of data) {
    await env.DB.prepare(`
    INSERT INTO storage_entries (id, key, data, createdAt, expiresAt)
    VALUES (?, ?, ?, ?, ?)
  `).bind(
            crypto.randomUUID(),
            entry.key,
            entry.data,
            entry.createdAt,
            entry.expiresAt,
        ).run();
}

Performance Optimization

Indexing Strategy

The schema includes indexes on:

  • key - Primary lookup
  • source - Filter cache queries
  • configName - Compilation history
  • expiresAt - TTL cleanup queries
  • timestamp - Time-series queries

Query Optimization

// Use batch operations when possible
const batch = await env.DB.batch([
  env.DB.prepare('INSERT INTO storage_entries ...').bind(...),
  env.DB.prepare('INSERT INTO storage_entries ...').bind(...),
]);

// Use pagination for large result sets
const entries = await prisma.storageEntry.findMany({
  take: 100,
  skip: page * 100,
  orderBy: { createdAt: 'desc' }
});

Caching Layer

For frequently accessed data, combine D1 with Workers KV:

// Check KV cache first
let data = await env.KV.get(key, 'json');

if (!data) {
    // Fall back to D1
    data = await storage.get(key);

    // Cache in KV for faster access
    await env.KV.put(key, JSON.stringify(data), { expirationTtl: 300 });
}

Monitoring and Debugging

D1 Analytics

Access D1 metrics in Cloudflare Dashboard:

  • Query counts
  • Read/write operations
  • Storage usage
  • Query latency

Query Logging

const prisma = new PrismaClient({
    adapter,
    log: ['query', 'info', 'warn', 'error'],
});

Error Handling

try {
    await storage.set(['key'], value);
} catch (error) {
    if (error.message.includes('D1_ERROR')) {
        console.error('D1 database error:', error);
        // Implement retry logic or fallback
    }
    throw error;
}

Deployment

Deploy to Cloudflare Workers

# Deploy worker (production — top-level default, no --env flag needed)
wrangler deploy

# Deploy to development environment
wrangler deploy --env development

Environment Variables

Set via wrangler or Cloudflare Dashboard:

wrangler secret put CACHE_TTL
wrangler secret put DEBUG

CI/CD Integration

# .github/workflows/deploy.yml
name: Deploy to Cloudflare
on:
    push:
        branches: [main]

jobs:
    deploy:
        runs-on: ubuntu-latest
        steps:
            - uses: actions/checkout@v4

            - name: Setup Node
              uses: actions/setup-node@v4
              with:
                  node-version: '20'

            - name: Install dependencies
              run: npm ci

            - name: Generate Prisma
              run: npx prisma generate --schema=prisma/schema.d1.prisma

            - name: Run D1 migrations
              run: wrangler d1 migrations apply adblock-storage
              env:
                  CLOUDFLARE_API_TOKEN: ${{ secrets.CF_API_TOKEN }}

            - name: Deploy Worker
              run: wrangler deploy
              env:
                  CLOUDFLARE_API_TOKEN: ${{ secrets.CF_API_TOKEN }}

Limitations

D1 Constraints

  • Row size: Maximum 1MB per row
  • Database size: 10GB per database (free tier: 5GB)
  • Query complexity: Complex JOINs may be slower
  • Concurrent writes: Limited compared to distributed databases

Workarounds

For large filter lists:

// Split large content into chunks
const CHUNK_SIZE = 500000; // 500KB chunks
const chunks = splitIntoChunks(content, CHUNK_SIZE);

for (let i = 0; i < chunks.length; i++) {
    await storage.set(['cache', 'filters', source, `chunk-${i}`], chunks[i]);
}

Troubleshooting

Common Issues

"D1_ERROR: no such table"

  • Run migrations: wrangler d1 execute adblock-storage --file=migrations/0001_init.sql

"BINDING_NOT_FOUND"

  • Verify wrangler.toml has correct [[d1_databases]] configuration

"Query timeout"

  • Optimize query or add pagination
  • Check for missing indexes

Local vs Remote mismatch

  • Ensure migrations applied to both: --local and remote

Debug Commands

# List all tables
wrangler d1 execute adblock-storage --command="SELECT name FROM sqlite_master WHERE type='table'"

# Check table schema
wrangler d1 execute adblock-storage --command=".schema storage_entries"

# Count entries
wrangler d1 execute adblock-storage --command="SELECT COUNT(*) FROM storage_entries"

References

Cloudflare Workflows

This document describes the Cloudflare Workflows implementation in the adblock-compiler, providing durable execution for compilation, batch processing, cache warming, and health monitoring.

Table of Contents


Overview

Cloudflare Workflows provide durable execution for long-running operations. Unlike traditional queue-based processing, workflows offer:

  • Automatic state persistence between steps
  • Crash recovery - resumes from the last successful step
  • Built-in retry with configurable policies
  • Observable step-by-step progress
  • Reliable scheduled execution with cron triggers

Benefits over Queue-Based Processing

FeatureQueue-BasedWorkflows
State PersistenceManual (KV)Automatic
Crash RecoveryRe-process entire messageResume from checkpoint
Step VisibilityLimitedFull step-by-step
Retry LogicCustom implementationBuilt-in with backoff
Long-running Tasks30s limitUp to 15 minutes per step
Scheduled ExecutionExternal schedulerNative cron triggers

Available Workflows

CompilationWorkflow

Handles single async compilation requests with durable state between steps.

Steps:

  1. validate - Validate configuration
  2. compile-sources - Fetch and compile all sources
  3. cache-result - Compress and store in KV
  4. update-metrics - Update workflow metrics

Parameters:

interface CompilationParams {
  requestId: string;           // Unique tracking ID
  configuration: IConfiguration; // Filter list config
  preFetchedContent?: Record<string, string>; // Optional pre-fetched content
  benchmark?: boolean;         // Include benchmark metrics
  priority?: 'standard' | 'high';
  queuedAt: number;           // Timestamp
}

API Endpoint: POST /workflow/compile

curl -X POST http://localhost:8787/workflow/compile \
  -H "Content-Type: application/json" \
  -d '{
    "configuration": {
      "name": "My Filter List",
      "sources": [
        {"source": "https://easylist.to/easylist/easylist.txt", "name": "EasyList"}
      ],
      "transformations": ["Deduplicate", "RemoveEmptyLines"]
    },
    "priority": "high"
  }'

Response:

{
  "success": true,
  "message": "Compilation workflow started",
  "workflowId": "wf-compile-abc123",
  "workflowType": "compilation",
  "requestId": "wf-compile-abc123",
  "configName": "My Filter List"
}

BatchCompilationWorkflow

Processes multiple compilations with per-chunk durability and crash recovery.

Steps:

  1. validate-batch - Validate all configurations
  2. compile-chunk-N - Process chunks of 3 compilations in parallel
  3. update-batch-metrics - Update aggregate metrics

Parameters:

interface BatchCompilationParams {
  batchId: string;
  requests: Array<{
    id: string;
    configuration: IConfiguration;
    preFetchedContent?: Record<string, string>;
    benchmark?: boolean;
  }>;
  priority?: 'standard' | 'high';
  queuedAt: number;
}

API Endpoint: POST /workflow/batch

curl -X POST http://localhost:8787/workflow/batch \
  -H "Content-Type: application/json" \
  -d '{
    "requests": [
      {
        "id": "request-1",
        "configuration": {
          "name": "EasyList",
          "sources": [{"source": "https://easylist.to/easylist/easylist.txt"}]
        }
      },
      {
        "id": "request-2",
        "configuration": {
          "name": "EasyPrivacy",
          "sources": [{"source": "https://easylist.to/easylist/easyprivacy.txt"}]
        }
      }
    ],
    "priority": "standard"
  }'

CacheWarmingWorkflow

Pre-populates the cache with popular filter lists. Runs on schedule or manual trigger.

Steps:

  1. check-cache-status - Identify configurations needing refresh
  2. warm-chunk-N - Compile and cache configurations in chunks
  3. update-warming-metrics - Track warming statistics

Default Popular Configurations:

  • EasyList
  • EasyPrivacy
  • AdGuard Base

Parameters:

interface CacheWarmingParams {
  runId: string;
  configurations: IConfiguration[]; // Empty = use defaults
  scheduled: boolean;
}

API Endpoint: POST /workflow/cache-warm

# Trigger with default configurations
curl -X POST http://localhost:8787/workflow/cache-warm \
  -H "Content-Type: application/json" \
  -d '{}'

# Trigger with custom configurations
curl -X POST http://localhost:8787/workflow/cache-warm \
  -H "Content-Type: application/json" \
  -d '{
    "configurations": [
      {
        "name": "Custom List",
        "sources": [{"source": "https://example.com/filters.txt"}]
      }
    ]
  }'

Cron Schedule: Every 6 hours (0 */6 * * *)


HealthMonitoringWorkflow

Monitors filter source availability and alerts on failures.

Steps:

  1. load-health-history - Load recent health check history
  2. check-source-N - Check each source individually
  3. analyze-results - Detect consecutive failures for alerting
  4. send-alerts - Send alerts if threshold exceeded
  5. store-results - Persist health data

Default Sources Monitored:

  • EasyList (expected: 50,000+ rules)
  • EasyPrivacy (expected: 10,000+ rules)
  • AdGuard Base (expected: 30,000+ rules)
  • AdGuard Tracking Protection (expected: 10,000+ rules)
  • Peter Lowe's List (expected: 2,000+ rules)

Health Thresholds:

  • Max response time: 30 seconds
  • Failure threshold: 3 consecutive failures before alerting

Parameters:

interface HealthMonitoringParams {
  runId: string;
  sources: Array<{
    name: string;
    url: string;
    expectedMinRules?: number;
  }>; // Empty = use defaults
  alertOnFailure: boolean;
}

API Endpoint: POST /workflow/health-check

# Trigger with default sources
curl -X POST http://localhost:8787/workflow/health-check \
  -H "Content-Type: application/json" \
  -d '{"alertOnFailure": true}'

# Check custom sources
curl -X POST http://localhost:8787/workflow/health-check \
  -H "Content-Type: application/json" \
  -d '{
    "sources": [
      {"name": "My Source", "url": "https://example.com/filters.txt", "expectedMinRules": 100}
    ],
    "alertOnFailure": true
  }'

Cron Schedule: Every hour (0 * * * *)


API Endpoints

Workflow Management

MethodEndpointDescription
POST/workflow/compileStart compilation workflow
POST/workflow/batchStart batch compilation workflow
POST/workflow/cache-warmTrigger cache warming
POST/workflow/health-checkTrigger health monitoring
GET/workflow/status/:type/:idGet workflow instance status
GET/workflow/events/:idGet real-time progress events
GET/workflow/metricsGet aggregate workflow metrics
GET/health/latestGet latest health check results

Status Endpoint

Get the status of a running or completed workflow:

curl http://localhost:8787/workflow/status/compilation/wf-compile-abc123

Response:

{
  "success": true,
  "workflowType": "compilation",
  "workflowId": "wf-compile-abc123",
  "status": "complete",
  "output": {
    "success": true,
    "requestId": "wf-compile-abc123",
    "configName": "My Filter List",
    "ruleCount": 45000,
    "totalDurationMs": 2500
  }
}

Workflow Status Values:

  • queued - Waiting to start
  • running - Currently executing
  • paused - Manually paused
  • complete - Successfully finished
  • errored - Failed with error
  • terminated - Manually stopped
  • unknown - Status unavailable

Metrics Endpoint

Get aggregate metrics for all workflows:

curl http://localhost:8787/workflow/metrics

Response:

{
  "compilation": {
    "totalRuns": 150,
    "successfulRuns": 145,
    "failedRuns": 5,
    "avgDurationMs": 3200,
    "lastRunAt": "2024-01-15T10:30:00Z"
  },
  "batch": {
    "totalRuns": 25,
    "totalCompilations": 100,
    "avgDurationMs": 15000
  },
  "cacheWarming": {
    "totalRuns": 48,
    "scheduledRuns": 46,
    "manualRuns": 2,
    "totalConfigsWarmed": 144
  },
  "health": {
    "totalChecks": 168,
    "totalSourcesChecked": 840,
    "totalHealthy": 820,
    "alertsTriggered": 3
  }
}

Latest Health Results

Get the most recent health check results:

curl http://localhost:8787/health/latest

Response:

{
  "success": true,
  "timestamp": "2024-01-15T10:00:00Z",
  "runId": "cron-health-abc123",
  "results": [
    {
      "name": "EasyList",
      "url": "https://easylist.to/easylist/easylist.txt",
      "healthy": true,
      "statusCode": 200,
      "responseTimeMs": 450,
      "ruleCount": 72500
    },
    {
      "name": "EasyPrivacy",
      "url": "https://easylist.to/easylist/easyprivacy.txt",
      "healthy": true,
      "statusCode": 200,
      "responseTimeMs": 380,
      "ruleCount": 18200
    }
  ],
  "summary": {
    "total": 5,
    "healthy": 5,
    "unhealthy": 0
  }
}

Workflow Events (Real-Time Progress)

Get real-time progress events for a running workflow:

# Get all events for a workflow
curl http://localhost:8787/workflow/events/wf-compile-abc123

# Get events since a specific timestamp (for polling)
curl "http://localhost:8787/workflow/events/wf-compile-abc123?since=2024-01-15T10:30:00.000Z"

Response:

{
  "success": true,
  "workflowId": "wf-compile-abc123",
  "workflowType": "compilation",
  "startedAt": "2024-01-15T10:30:00.000Z",
  "completedAt": "2024-01-15T10:30:05.000Z",
  "progress": 100,
  "isComplete": true,
  "events": [
    {
      "type": "workflow:started",
      "workflowId": "wf-compile-abc123",
      "workflowType": "compilation",
      "timestamp": "2024-01-15T10:30:00.000Z",
      "data": {"configName": "My Filter List", "sourceCount": 2}
    },
    {
      "type": "workflow:step:started",
      "workflowId": "wf-compile-abc123",
      "workflowType": "compilation",
      "timestamp": "2024-01-15T10:30:00.100Z",
      "step": "validate"
    },
    {
      "type": "workflow:progress",
      "workflowId": "wf-compile-abc123",
      "workflowType": "compilation",
      "timestamp": "2024-01-15T10:30:00.500Z",
      "progress": 25,
      "message": "Configuration validated"
    },
    {
      "type": "workflow:completed",
      "workflowId": "wf-compile-abc123",
      "workflowType": "compilation",
      "timestamp": "2024-01-15T10:30:05.000Z",
      "data": {"ruleCount": 45000, "totalDurationMs": 5000}
    }
  ]
}

Event Types:

TypeDescription
workflow:startedWorkflow execution began
workflow:step:startedA workflow step started
workflow:step:completedA workflow step finished successfully
workflow:step:failedA workflow step failed
workflow:progressProgress update with percentage and message
workflow:completedWorkflow finished successfully
workflow:failedWorkflow failed with error
source:fetch:startedSource fetch operation started
source:fetch:completedSource fetch completed with rule count
transformation:startedTransformation step started
transformation:completedTransformation completed
cache:storedResult cached to KV
health:check:startedHealth check started for a source
health:check:completedHealth check completed

Polling for Real-Time Updates:

To monitor workflow progress in real-time, poll the events endpoint:

async function pollWorkflowEvents(workflowId) {
    let lastTimestamp = null;

    while (true) {
        const url = `/workflow/events/${workflowId}`;
        const params = lastTimestamp ? `?since=${encodeURIComponent(lastTimestamp)}` : '';

        const response = await fetch(url + params);
        const data = await response.json();

        if (data.events?.length > 0) {
            for (const event of data.events) {
                console.log(`[${event.type}] ${event.message || event.step || ''}`);
                lastTimestamp = event.timestamp;
            }
        }

        if (data.isComplete) {
            console.log('Workflow completed!');
            break;
        }

        await new Promise(resolve => setTimeout(resolve, 2000));
    }
}

Scheduled Workflows (Cron)

Workflows can be triggered automatically via cron schedules defined in wrangler.toml:

[triggers]
crons = [
    "0 */6 * * *",   # Cache warming: every 6 hours
    "0 * * * *",     # Health monitoring: every hour
]

The scheduled() handler routes cron events to the appropriate workflow:

Cron PatternWorkflowPurpose
0 */6 * * *CacheWarmingWorkflowPre-warm popular filter list caches
0 * * * *HealthMonitoringWorkflowMonitor source availability

Configuration

wrangler.toml

# Workflow bindings
[[workflows]]
name = "compilation-workflow"
binding = "COMPILATION_WORKFLOW"
class_name = "CompilationWorkflow"

[[workflows]]
name = "batch-compilation-workflow"
binding = "BATCH_COMPILATION_WORKFLOW"
class_name = "BatchCompilationWorkflow"

[[workflows]]
name = "cache-warming-workflow"
binding = "CACHE_WARMING_WORKFLOW"
class_name = "CacheWarmingWorkflow"

[[workflows]]
name = "health-monitoring-workflow"
binding = "HEALTH_MONITORING_WORKFLOW"
class_name = "HealthMonitoringWorkflow"

# Cron triggers
[triggers]
crons = [
    "0 */6 * * *",
    "0 * * * *",
]

Step Configuration

Each workflow step can have custom retry and timeout settings:

await step.do('step-name', {
    retries: {
        limit: 3,                    // Max retries
        delay: '30 seconds',         // Initial delay
        backoff: 'exponential',      // Backoff strategy
    },
    timeout: '5 minutes',            // Step timeout
}, async () => {
    // Step logic
});

Error Handling & Recovery

Automatic Retry

Each step has configurable retry policies:

  • Compilation steps: 2 retries with 30s exponential backoff, 5 minute timeout
  • Cache steps: 2 retries with 2s delay
  • Health checks: 2 retries with 5s delay, 2 minute timeout

Crash Recovery

If a workflow crashes mid-execution:

  1. Cloudflare detects the failure
  2. Workflow resumes from the last completed step
  3. State is automatically restored
  4. Processing continues without re-running completed steps

Dead Letter Handling

Failed workflows after max retries are logged with:

  • Full error details
  • Step that failed
  • Workflow parameters
  • Timestamp

Alerts can be configured via the health monitoring workflow to notify on persistent failures.


Workflow Diagrams

Compilation Workflow

flowchart TD
    Start[Workflow Start] --> Validate[Step: validate]
    Validate -->|Valid| Compile[Step: compile-sources]
    Validate -->|Invalid| Error[Return Error Result]

    Compile -->|Success| Cache[Step: cache-result]
    Compile -->|Retry| Compile
    Compile -->|Max Retries| Error

    Cache --> Metrics[Step: update-metrics]
    Metrics --> Complete[Return Success Result]

    Error --> Complete

    style Validate fill:#e1f5ff
    style Compile fill:#fff9c4
    style Cache fill:#c8e6c9
    style Metrics fill:#e1f5ff
    style Complete fill:#4caf50
    style Error fill:#ffcdd2

Batch Workflow with Chunking

flowchart TD
    Start[Workflow Start] --> ValidateBatch[Step: validate-batch]
    ValidateBatch --> Chunk1[Step: compile-chunk-1]

    Chunk1 --> Item1A[Compile Item 1]
    Chunk1 --> Item1B[Compile Item 2]
    Chunk1 --> Item1C[Compile Item 3]

    Item1A --> Chunk1Done
    Item1B --> Chunk1Done
    Item1C --> Chunk1Done

    Chunk1Done[Chunk 1 Complete] --> Chunk2[Step: compile-chunk-2]

    Chunk2 --> Item2A[Compile Item 4]
    Chunk2 --> Item2B[Compile Item 5]

    Item2A --> Chunk2Done
    Item2B --> Chunk2Done

    Chunk2Done[Chunk 2 Complete] --> Metrics[Step: update-batch-metrics]
    Metrics --> Complete[Return Batch Result]

    style ValidateBatch fill:#e1f5ff
    style Chunk1 fill:#fff9c4
    style Chunk2 fill:#fff9c4
    style Metrics fill:#e1f5ff
    style Complete fill:#4caf50

Health Monitoring Workflow

flowchart TD
    Start[Cron/Manual Trigger] --> LoadHistory[Step: load-health-history]
    LoadHistory --> CheckSource1[Step: check-source-1]
    CheckSource1 --> Delay1[Sleep 2s]
    Delay1 --> CheckSource2[Step: check-source-2]
    CheckSource2 --> Delay2[Sleep 2s]
    Delay2 --> CheckSourceN[Step: check-source-N]

    CheckSourceN --> Analyze[Step: analyze-results]
    Analyze -->|Alerts Needed| SendAlerts[Step: send-alerts]
    Analyze -->|No Alerts| Store
    SendAlerts --> Store[Step: store-results]

    Store --> Complete[Return Health Result]

    style LoadHistory fill:#e1f5ff
    style CheckSource1 fill:#fff9c4
    style CheckSource2 fill:#fff9c4
    style CheckSourceN fill:#fff9c4
    style Analyze fill:#ffe0b2
    style SendAlerts fill:#ffcdd2
    style Store fill:#c8e6c9
    style Complete fill:#4caf50

Notes

  • Workflows are available when deployed to Cloudflare Workers
  • Local development may use stubs for workflow bindings
  • Metrics are stored in the METRICS KV namespace
  • Cached results use the COMPILATION_CACHE KV namespace
  • Health history is retained for 30 days
  • Workflow instances can be monitored in the Cloudflare dashboard

Queue Diagnostic Events

This document describes how diagnostic events are emitted during queue-based compilation operations.

Overview

The adblock-compiler queue system emits comprehensive diagnostic events throughout the compilation lifecycle, providing full observability into asynchronous compilation jobs.

Event Flow

1. Queue Message Received

When a queue consumer receives a compilation message:

// Create tracing context with metadata
const tracingContext = createTracingContext({
    metadata: {
        endpoint: 'queue/compile',
        configName: configuration.name,
        requestId: message.requestId,
        timestamp: message.timestamp,
        cacheKey: cacheKey || undefined,
    },
});

2. Compilation Execution

The tracing context is passed to the compiler:

const compiler = new WorkerCompiler({
    preFetchedContent,
    tracingContext,  // Enables diagnostic collection
});

const result = await compiler.compileWithMetrics(configuration, benchmark ?? false);

3. Diagnostic Emission

After compilation completes, all diagnostic events are emitted to the tail worker:

if (result.diagnostics) {
    console.log(`[QUEUE:COMPILE] Emitting ${result.diagnostics.length} diagnostic events`);
    emitDiagnosticsToTailWorker(result.diagnostics);
}

Diagnostic Event Types

Queue compilations emit the same diagnostic events as synchronous compilations:

Operation Events

  • operationStart: Start of operations like validation, source compilation, transformations
  • operationComplete: Successful completion with result metadata
  • operationError: Operation failures with error details

Network Events

  • network: HTTP requests for downloading filter lists
    • Request details (URL, method, headers)
    • Response metadata (status, size, duration)
    • Error information for failed requests

Cache Events

  • cache: Cache operations during compilation
    • Cache hits/misses
    • Compression statistics
    • Storage operations

Performance Events

  • performanceMetric: Performance measurements
    • Operation durations
    • Resource usage
    • Throughput metrics

Tracing Context Metadata

Each diagnostic event includes metadata from the tracing context:

{
  "endpoint": "queue/compile",
  "configName": "AdGuard DNS filter",
  "requestId": "compile-1704931200000-abc123",
  "timestamp": 1704931200000,
  "cacheKey": "cache:a1b2c3d4e5f6..."
}

This metadata allows correlation of diagnostic events with specific queue jobs.

Tail Worker Integration

Diagnostic events are emitted through console logging with structured JSON:

function emitDiagnosticsToTailWorker(diagnostics: DiagnosticEvent[]): void {
    // Summary
    console.log('[DIAGNOSTICS]', JSON.stringify({
        eventCount: diagnostics.length,
        timestamp: new Date().toISOString(),
    }));
    
    // Individual events
    for (const event of diagnostics) {
        const logData = {
            ...event,
            source: 'adblock-compiler',
        };
        
        // Use appropriate log level based on severity
        switch (event.severity) {
            case 'error':
                console.error('[DIAGNOSTIC]', JSON.stringify(logData));
                break;
            case 'warn':
                console.warn('[DIAGNOSTIC]', JSON.stringify(logData));
                break;
            case 'info':
                console.info('[DIAGNOSTIC]', JSON.stringify(logData));
                break;
            default:
                console.debug('[DIAGNOSTIC]', JSON.stringify(logData));
        }
    }
}

Log Prefixes

Queue operations use structured logging prefixes for easy filtering:

PrefixPurpose
[QUEUE:HANDLER]Queue consumer batch processing
[QUEUE:COMPILE]Single compilation processing
[QUEUE:BATCH]Batch compilation processing
[QUEUE:CACHE-WARM]Cache warming processing
[QUEUE:CHUNKS]Chunk-based parallel processing
[DIAGNOSTICS]Diagnostic event summary
[DIAGNOSTIC]Individual diagnostic event

Example Diagnostic Flow

Complete Compilation Lifecycle

1. [QUEUE:COMPILE] Starting compilation for "AdGuard DNS filter" (requestId: compile-123)
2. [QUEUE:COMPILE] Cache key: cache:a1b2c3d4e5f6...
3. [DIAGNOSTIC] { eventType: "operationStart", operation: "validateConfiguration", ... }
4. [DIAGNOSTIC] { eventType: "operationComplete", operation: "validateConfiguration", ... }
5. [DIAGNOSTIC] { eventType: "operationStart", operation: "compileSources", ... }
6. [DIAGNOSTIC] { eventType: "network", url: "https://...", duration: 234, ... }
7. [DIAGNOSTIC] { eventType: "operationComplete", operation: "downloadSource", ... }
8. [DIAGNOSTIC] { eventType: "operationComplete", operation: "compileSources", ... }
9. [DIAGNOSTIC] { eventType: "performanceMetric", metric: "totalCompilationTime", ... }
10. [QUEUE:COMPILE] Compilation completed in 2345ms, 12500 rules generated
11. [DIAGNOSTICS] { eventCount: 15, timestamp: "2024-01-14T04:00:00.000Z" }
12. [QUEUE:COMPILE] Cached compilation in 123ms (1234567 -> 345678 bytes, 72.0% compression)
13. [QUEUE:COMPILE] Total processing time: 2468ms for "AdGuard DNS filter"

Monitoring Diagnostic Events

Using Wrangler CLI

Stream queue diagnostics in real-time:

# All diagnostics
wrangler tail | grep "DIAGNOSTIC"

# Only errors
wrangler tail | grep "DIAGNOSTIC.*error"

# Specific config
wrangler tail | grep "AdGuard DNS filter"

Using Cloudflare Dashboard

  1. Navigate to Workers & Pages > Your Worker
  2. Click Logs tab
  3. Filter by:
    • Prefix: [DIAGNOSTIC]
    • Severity: error, warn, info, debug
    • Request ID: compile-*, batch-*, warm-*

Using Tail Worker

Configure a tail worker in wrangler.toml to export diagnostics:

[[tail_consumers]]
service = "adblock-compiler-tail-worker"

The tail worker can:

  • Forward to external monitoring (Datadog, Splunk, etc.)
  • Aggregate metrics
  • Trigger alerts on errors
  • Store for analysis

Diagnostic Event Schema

Example: Source Download

{
  "eventType": "network",
  "category": "network",
  "severity": "info",
  "timestamp": "2024-01-14T04:00:00.000Z",
  "traceId": "trace-123",
  "spanId": "span-456",
  "metadata": {
    "endpoint": "queue/compile",
    "configName": "AdGuard DNS filter",
    "requestId": "compile-1704931200000-abc123",
    "timestamp": 1704931200000,
    "cacheKey": "cache:a1b2c3d4e5f6..."
  },
  "url": "https://adguardteam.github.io/.../filter.txt",
  "method": "GET",
  "statusCode": 200,
  "duration": 234,
  "size": 123456
}

Example: Transformation Complete

{
  "eventType": "operationComplete",
  "category": "operation",
  "severity": "info",
  "timestamp": "2024-01-14T04:00:01.000Z",
  "operation": "applyTransformation",
  "metadata": {
    "endpoint": "queue/compile",
    "configName": "AdGuard DNS filter",
    "requestId": "compile-1704931200000-abc123"
  },
  "transformation": "Deduplicate",
  "inputCount": 12600,
  "outputCount": 12500,
  "duration": 45
}

Comparison: Queue vs Synchronous

AspectSynchronous (/compile)Queue (/compile/async)
Diagnostic Events✅ Emitted✅ Emitted
Tracing Context✅ Included✅ Included
Real-time Stream✅ Via SSE (/compile/stream)❌ No (async processing)
Tail Worker✅ Emitted✅ Emitted
Request IDGenerated per request✅ Tracked in queue
MetadataBasic✅ Enhanced (requestId, timestamp, priority)

Best Practices

1. Include Request IDs

Always reference the requestId when investigating queue jobs:

wrangler tail | grep "compile-1704931200000-abc123"

2. Monitor Error Events

Set up alerts for diagnostic events with severity: "error":

// In tail worker
if (event.severity === 'error') {
    await sendToAlertingSystem(event);
}

3. Track Performance Metrics

Aggregate performance metrics from diagnostic events:

const metrics = diagnostics
    .filter(e => e.eventType === 'performanceMetric')
    .reduce((acc, e) => {
        acc[e.metric] = e.value;
        return acc;
    }, {});

4. Correlate with Queue Stats

Combine diagnostic events with queue statistics for complete visibility:

# Get queue stats
curl https://your-worker.dev/queue/stats

# Stream diagnostics
wrangler tail | grep "DIAGNOSTIC"

Troubleshooting

Missing Diagnostics

If diagnostic events aren't being emitted:

  1. Check tracing context creation:

    const tracingContext = createTracingContext({ metadata });
    
  2. Verify compiler initialization:

    const compiler = new WorkerCompiler({ tracingContext });
    
  3. Confirm emission call:

    emitDiagnosticsToTailWorker(result.diagnostics);
    

Incomplete Events

If events are missing details:

  • Ensure metadata is complete when creating tracing context
  • Check that event handlers are properly configured
  • Verify tail worker is receiving all console output

Performance Impact

Diagnostic emission has minimal overhead:

  • Events collected during compilation (already happening)
  • Emission is fire-and-forget (doesn't block)
  • Structured logging is optimized for Cloudflare Workers

Summary

Queue operations emit full diagnostic eventsTracing context includes queue-specific metadataEvents are logged to tail worker with structured prefixesSame diagnostic events as synchronous operationsFull observability into asynchronous compilation

Queue-based compilation provides the same level of diagnostic observability as synchronous compilation, with additional metadata for tracking asynchronous job lifecycle.

Cloudflare Queue Support

This document describes how to use the Cloudflare Queue integration for async compilation jobs.

Overview

The adblock-compiler worker now supports asynchronous compilation through Cloudflare Queues. This is useful for:

  • Long-running compilations - Offload CPU-intensive work to background processing
  • Batch operations - Process multiple compilations without blocking
  • Cache warming - Pre-compile popular filter lists asynchronously
  • Rate limit bypass - Queue requests that would otherwise be rate-limited
  • Priority processing - Premium users and urgent compilations get faster processing

See Also: Queue Architecture Diagram for visual representation of the queue flow.

Queue Configuration

The worker uses two queues for different priority levels:

# Standard priority queue
[[queues.producers]]
 queue = "adblock-compiler-worker-queue"
 binding = "ADBLOCK_COMPILER_QUEUE"

# High priority queue for premium users
[[queues.producers]]
 queue = "adblock-compiler-worker-queue-high-priority"
 binding = "ADBLOCK_COMPILER_QUEUE_HIGH_PRIORITY"

# Standard queue consumer
[[queues.consumers]]
 queue = "adblock-compiler-worker-queue"
 max_batch_size = 10
 max_batch_timeout = 5
 dead_letter_queue = "dead-letter-queue"

# High priority queue consumer (faster processing)
[[queues.consumers]]
 queue = "adblock-compiler-worker-queue-high-priority"
 max_batch_size = 5     # smaller batches for faster response
 max_batch_timeout = 2  # shorter timeout for quicker processing
 dead_letter_queue = "dead-letter-queue"

Priority Levels

The worker supports two priority levels:

  • standard (default) - Normal processing speed, larger batches
  • high - Faster processing with smaller batches and shorter timeouts

High priority jobs are routed to a separate queue with optimized settings for faster turnaround.

API Endpoints

POST /compile/async

Queue a single compilation job for asynchronous processing.

Request Body:

{
    "configuration": {
        "name": "My Filter List",
        "sources": [
            {
                "source": "https://example.com/filters.txt"
            }
        ],
        "transformations": ["Deduplicate", "RemoveEmptyLines"]
    },
    "benchmark": true,
    "priority": "high"
}

Fields:

  • configuration (required) - Compilation configuration
  • benchmark (optional) - Enable benchmarking
  • priority (optional) - Priority level: "standard" (default) or "high"

Response (202 Accepted):

{
    "success": true,
    "message": "Compilation job queued successfully",
    "note": "The compilation will be processed asynchronously and cached when complete",
    "requestId": "compile-1704931200000-abc123",
    "priority": "high"
}

POST /compile/batch/async

Queue multiple compilation jobs for asynchronous processing.

Request Body:

{
    "requests": [
        {
            "id": "filter-1",
            "configuration": {
                "name": "Filter List 1",
                "sources": [
                    {
                        "source": "https://example.com/filter1.txt"
                    }
                ]
            }
        },
        {
            "id": "filter-2",
            "configuration": {
                "name": "Filter List 2",
                "sources": [
                    {
                        "source": "https://example.com/filter2.txt"
                    }
                ]
            }
        }
    ],
    "priority": "high"
}

Fields:

  • requests (required) - Array of compilation requests
  • priority (optional) - Priority level for the entire batch: "standard" (default) or "high"

Response (202 Accepted):

{
    "success": true,
    "message": "Batch of 2 compilation jobs queued successfully",
    "note": "The compilations will be processed asynchronously and cached when complete",
    "requestId": "batch-1704931200000-def456",
    "batchSize": 2,
    "priority": "high"
}

Limits:

  • Maximum 100 requests per batch
  • No rate limiting (queue handles backpressure)

Queue Message Types

The worker processes three types of queue messages, all supporting optional priority:

1. Compile Message

Single compilation job with optional pre-fetched content, benchmarking, and priority.

{
  type: 'compile',
  requestId: 'compile-123',
  timestamp: 1704931200000,
  priority: 'high',  // or 'standard' (default)
  configuration: { /* IConfiguration */ },
  preFetchedContent?: { /* url: content */ },
  benchmark?: boolean
}

2. Batch Compile Message

Multiple compilation jobs processed in parallel with optional priority.

{
  type: 'batch-compile',
  requestId: 'batch-123',
  timestamp: 1704931200000,
  priority: 'high',  // or 'standard' (default)
  requests: [
    {
      id: 'req-1',
      configuration: { /* IConfiguration */ },
      preFetchedContent?: { /* url: content */ },
      benchmark?: boolean
    },
    // ... more requests
  ]
}

3. Cache Warm Message

Pre-compile multiple configurations to warm the cache with optional priority.

{
  type: 'cache-warm',
  requestId: 'warm-123',
  timestamp: 1704931200000,
  priority: 'high',  // or 'standard' (default)
  configurations: [
    { /* IConfiguration */ },
    // ... more configurations
  ]
}

How It Works

  1. Request - Client sends a POST request to /compile/async or /compile/batch/async with optional priority field
  2. Routing - Worker routes the message to the appropriate queue based on priority level
  3. Response - Worker immediately returns 202 Accepted with the priority level
  4. Processing - Queue consumer processes the message asynchronously
  5. Caching - Compiled results are cached in KV storage
  6. Retrieval - Client can later retrieve cached results via /compile endpoint

Retry Behavior

The queue consumer automatically retries failed messages:

  • Success - Message is acknowledged and removed from queue
  • Failure - Message is retried with exponential backoff
  • Unknown Type - Message is acknowledged to prevent infinite retries

Benefits

Compared to Synchronous Endpoints

FeatureSync (/compile)Async (/compile/async)
Response TimeWaits for compilationImmediate (202 Accepted)
Rate LimitingYes (10 req/min)No (queue handles backpressure)
CPU UsageBlocks workerBackground processing
Use CaseInteractive requestsBatch operations, pre-warming

Use Cases

Cache Warming

# Pre-compile popular filter lists during low-traffic periods
curl -X POST https://your-worker.dev/compile/async \
  -H "Content-Type: application/json" \
  -d '{
    "configuration": {
      "name": "AdGuard DNS filter",
      "sources": [{
        "source": "https://adguardteam.github.io/AdGuardSDNSFilter/Filters/filter.txt"
      }]
    }
  }'

Batch Processing

# Process multiple filter lists without blocking
curl -X POST https://your-worker.dev/compile/batch/async \
  -H "Content-Type: application/json" \
  -d '{
    "requests": [
      {"id": "adguard", "configuration": {...}},
      {"id": "easylist", "configuration": {...}},
      {"id": "easyprivacy", "configuration": {...}}
    ]
  }'

Monitoring and Tracing

Queue processing includes comprehensive logging and diagnostics for observability.

Logging Prefixes

All queue operations use structured logging with prefixes for easy filtering:

  • [QUEUE:HANDLER] - Queue consumer batch processing
  • [QUEUE:COMPILE] - Individual compilation processing
  • [QUEUE:BATCH] - Batch compilation processing
  • [QUEUE:CACHE-WARM] - Cache warming processing
  • [QUEUE:CHUNKS] - Chunk-based parallel processing
  • [API:ASYNC] - Async API endpoint operations
  • [API:BATCH-ASYNC] - Batch async API endpoint operations

Log Monitoring

Queue processing is logged to the console and can be monitored via:

  • Cloudflare Dashboard > Workers & Pages > Your Worker > Logs
  • Tail Worker (if configured) - Real-time log streaming
  • Analytics Engine (if configured) - Aggregated metrics
  • Wrangler CLI - wrangler tail for live log streaming

Example Log Output

[API:ASYNC] Queueing compilation for "AdGuard DNS filter"
[API:ASYNC] Queued successfully in 45ms (requestId: compile-1704931200000-abc123)

[QUEUE:HANDLER] Processing batch of 3 messages
[QUEUE:HANDLER] Processing message 1/3, type: compile, requestId: compile-1704931200000-abc123

[QUEUE:COMPILE] Starting compilation for "AdGuard DNS filter" (requestId: compile-1704931200000-abc123)
[QUEUE:COMPILE] Cache key: cache:a1b2c3d4e5f6g7h8...
[QUEUE:COMPILE] Compilation completed in 2345ms, 12500 rules generated
[QUEUE:COMPILE] Emitting 15 diagnostic events
[QUEUE:COMPILE] Cached compilation in 123ms (1234567 -> 345678 bytes, 72.0% compression)
[QUEUE:COMPILE] Total processing time: 2468ms for "AdGuard DNS filter"

[QUEUE:HANDLER] Message 1/3 completed in 2470ms and acknowledged
[QUEUE:HANDLER] Batch complete: 3 messages processed in 7234ms (avg 2411ms per message). Acked: 3, Retried: 0, Unknown: 0

Tracing and Diagnostics

Each compilation includes a tracing context that captures:

  • Metadata: Endpoint, config name, request ID, timestamp
  • Diagnostic Events: Source downloads, transformations, validation
  • Performance Metrics: Duration, rule counts, compression ratios
  • Error Details: Stack traces, error messages, retry attempts

Diagnostic events are emitted to the tail worker for centralized monitoring:

{
  "eventType": "source:complete",
  "sourceIndex": 0,
  "ruleCount": 12500,
  "durationMs": 1234,
  "metadata": {
    "endpoint": "queue/compile",
    "configName": "AdGuard DNS filter",
    "requestId": "compile-1704931200000-abc123"
  }
}

Performance Metrics

The following metrics are logged for each operation:

  • Enqueue Time: Time to queue the message
  • Processing Time: Total compilation duration
  • Compression Ratio: Storage reduction percentage
  • Cache Operations: Time to compress and store
  • Success/Failure Rate: Per message and per batch
  • Chunk Processing: Parallel processing statistics

Monitoring Tools

  1. Real-time Logs

    # Stream logs in real-time
    wrangler tail
    
    # Filter by prefix
    wrangler tail | grep "QUEUE:COMPILE"
    
  2. Cloudflare Dashboard

    • Navigate to Workers & Pages > Your Worker
    • View Logs tab for historical logs
    • Use Analytics tab for aggregated metrics
  3. Tail Worker Integration

    • Configured in wrangler.toml
    • Processes all console logs
    • Can export to external services

Error Handling

Errors during queue processing are:

  1. Logged to console with full error details
  2. Message is retried automatically with exponential backoff
  3. After max retries, message is sent to dead letter queue (if configured)
  4. Error metrics are tracked and reported

Error Log Example

[QUEUE:COMPILE] Processing failed after 5234ms for "Invalid Filter": 
  Error: Source download failed: Network timeout
[QUEUE:HANDLER] Message 2/5 failed after 5236ms, will retry: 
  Error: Source download failed: Network timeout

Performance Considerations

Queue Configuration

  • Standard queue: Processes messages in batches (max 10), timeout 5 seconds
  • High-priority queue: Smaller batches (max 5), shorter timeout (2 seconds) for faster response
  • Batch compilations process requests in chunks of 3 in parallel
  • Cache TTL is 1 hour (configurable in worker code)

Processing Times

  • Large filter lists may take several seconds to compile
  • High-priority jobs are processed faster due to smaller batch sizes
  • Compression reduces storage by 70-80%
  • Gzip compression/decompression adds ~100ms overhead

Priority Queue Benefits

  • High priority: Faster turnaround time, ideal for premium users or urgent requests
  • Standard priority: Higher throughput, ideal for batch operations and scheduled jobs

Local Development

To test queue functionality locally (including priority):

# Start the worker in development mode
deno task wrangler:dev

# In another terminal, send a standard priority request
curl -X POST http://localhost:8787/compile/async \
  -H "Content-Type: application/json" \
  -d '{
    "configuration": {
      "name": "Test",
      "sources": [{"source": "https://example.com/test.txt"}]
    }
  }'

# Send a high priority request
curl -X POST http://localhost:8787/compile/async \
  -H "Content-Type: application/json" \
  -d '{
    "configuration": {
      "name": "Urgent Test",
      "sources": [{"source": "https://example.com/urgent.txt"}]
    },
    "priority": "high"
  }'

Note: Local development mode simulates queue behavior but doesn't persist messages.

Deployment

Ensure both queues are created before deploying:

# Create the standard priority queue (first time only)
wrangler queues create adblock-compiler-worker-queue

# Create the high priority queue (first time only)
wrangler queues create adblock-compiler-worker-queue-high-priority

# Deploy the worker
deno task wrangler:deploy

Troubleshooting

Queue not processing messages

  • Check queue configuration in wrangler.toml
  • Verify both queues exist: wrangler queues list
  • Check worker logs for errors

Messages failing repeatedly

  • Check error logs for specific failure reasons
  • Verify source URLs are accessible
  • Check KV namespace bindings are correct

Slow processing

  • Increase max_batch_size in wrangler.toml
  • Consider scaling worker resources
  • Review filter list sizes and complexity

Architecture

Queue Flow Diagram

graph TB
    subgraph "Client Layer"
        CLIENT[Client/Browser]
    end

    subgraph "API Endpoints"
        ASYNC_EP[POST /compile/async]
        BATCH_EP[POST /compile/batch/async]
        SYNC_EP[POST /compile]
    end

    subgraph "Queue Producer"
        ENQUEUE[Queue Message Producer]
        GEN_ID[Generate Request ID]
        CREATE_MSG[Create Queue Message]
    end

    subgraph "Cloudflare Queue"
        QUEUE[(adblock-compiler-worker-queue)]
        QUEUE_HIGH[(adblock-compiler-worker-queue-high-priority)]
        QUEUE_BATCH[Message Batching]
    end

    subgraph "Queue Consumer"
        CONSUMER[Queue Consumer Handler]
        DISPATCHER[Message Type Dispatcher]
        COMPILE_PROC[Process Compile Message]
        BATCH_PROC[Process Batch Message]
        CACHE_PROC[Process Cache Warm Message]
    end

    subgraph "Storage Layer"
        KV_CACHE[(KV: COMPILATION_CACHE)]
        COMPRESS[Gzip Compression]
    end

    CLIENT -->|POST request| ASYNC_EP
    CLIENT -->|POST request| BATCH_EP
    CLIENT -->|GET cached result| SYNC_EP

    ASYNC_EP -->|Queue message| ENQUEUE
    BATCH_EP -->|Queue message| ENQUEUE

    ENQUEUE --> GEN_ID
    GEN_ID --> CREATE_MSG
    CREATE_MSG -->|standard priority| QUEUE
    CREATE_MSG -->|high priority| QUEUE_HIGH

    QUEUE --> QUEUE_BATCH
    QUEUE_HIGH --> QUEUE_BATCH
    QUEUE_BATCH -->|Batched messages| CONSUMER

    CONSUMER --> DISPATCHER
    DISPATCHER -->|type: 'compile'| COMPILE_PROC
    DISPATCHER -->|type: 'batch-compile'| BATCH_PROC
    DISPATCHER -->|type: 'cache-warm'| CACHE_PROC

    COMPILE_PROC --> COMPRESS
    COMPRESS --> KV_CACHE

    SYNC_EP -.->|Read cache| KV_CACHE

    style QUEUE fill:#f9f,stroke:#333,stroke-width:4px
    style QUEUE_HIGH fill:#ff9,stroke:#333,stroke-width:4px
    style CONSUMER fill:#bbf,stroke:#333,stroke-width:4px
    style KV_CACHE fill:#bfb,stroke:#333,stroke-width:2px

Message Flow Sequence

sequenceDiagram
    participant C as Client
    participant API as API Endpoint
    participant Q as Queue
    participant QC as Queue Consumer
    participant Comp as Compiler
    participant Cache as KV Cache

    Note over C,Cache: Async Compile Flow

    C->>API: POST /compile/async
    API->>API: Generate Request ID
    API->>Q: Send CompileQueueMessage
    API-->>C: 202 Accepted (requestId)

    Q->>QC: Deliver message batch
    QC->>QC: Dispatch by type
    QC->>Comp: Execute compilation
    Comp-->>QC: Compiled rules + metrics
    QC->>Cache: Store compressed result
    QC->>Q: ACK message

    Note over C,Cache: Cache Result Retrieval

    C->>API: POST /compile (with config)
    API->>Cache: Check for cached result
    Cache-->>API: Compressed result
    API-->>C: 200 OK (rules, cached: true)

Processing Flow

flowchart TD
    START[Queue Message Received] --> VALIDATE{Validate Message Type}

    VALIDATE -->|compile| SINGLE[Single Compilation]
    VALIDATE -->|batch-compile| BATCH[Batch Compilation]
    VALIDATE -->|cache-warm| WARM[Cache Warming]
    VALIDATE -->|unknown| UNKNOWN[Unknown Type]

    SINGLE --> COMP1[Run Compilation]
    COMP1 --> COMPRESS1[Compress Result]
    COMPRESS1 --> STORE1[Store in KV]
    STORE1 --> ACK1[ACK Message]

    BATCH --> CHUNK[Split into Chunks of 3]
    CHUNK --> PARALLEL[Process Chunks in Parallel]
    PARALLEL --> STATS{All Successful?}
    STATS -->|Yes| ACK2[ACK Message]
    STATS -->|No| RETRY2[RETRY Message]

    WARM --> CHUNK2[Split into Chunks]
    CHUNK2 --> PARALLEL2[Process in Parallel]
    PARALLEL2 --> ACK3[ACK Message]

    UNKNOWN --> ACK_UNK[ACK to prevent infinite retries]

    ACK1 --> END[Processing Complete]
    ACK2 --> END
    ACK3 --> END
    ACK_UNK --> END
    RETRY2 --> RETRY_QUEUE[Back to Queue with Backoff]

Key Features

  • Asynchronous Processing: Non-blocking API endpoints with immediate 202 response
  • Priority Queues: Two-tier system for standard and high-priority processing
  • Concurrency Control: Chunked batch processing (max 3 parallel compilations)
  • Caching: Gzip compression reduces storage by 70-80%
  • Error Handling: Automatic retry with exponential backoff
  • Monitoring: Structured logging with prefixes for easy filtering

Further Reading

End-to-End Tests

Automated end-to-end tests for the Adblock Compiler API and WebSocket endpoints.

Overview

The e2e test suite includes:

  • API Tests (api.e2e.test.ts) - HTTP endpoint testing

    • Core API endpoints
    • Compilation and batch compilation
    • Streaming (SSE)
    • Queue operations
    • Performance testing
    • Error handling
  • WebSocket Tests (websocket.e2e.test.ts) - Real-time connection testing

    • Connection lifecycle
    • Real-time compilation
    • Session management
    • Event streaming
    • Error handling

Prerequisites

The e2e tests require a running server instance. You have two options:

Option 1: Local Development Server

# In terminal 1 - Start the development server
deno task dev

# In terminal 2 - Run the e2e tests
deno task test:e2e

Option 2: Test Against Remote Server

# Set the E2E_BASE_URL environment variable
E2E_BASE_URL=https://adblock-compiler.jayson-knight.workers.dev deno task test:e2e

Running Tests

Run All E2E Tests

deno task test:e2e

This runs both API and WebSocket tests.

Run Only API Tests

deno task test:e2e:api

Run Only WebSocket Tests

deno task test:e2e:ws

Run Individual Test Files

# API tests only
deno test --allow-net worker/api.e2e.test.ts

# WebSocket tests only
deno test --allow-net worker/websocket.e2e.test.ts

Run Specific Tests

# Run tests matching a pattern
deno test --allow-net --filter "compile" worker/api.e2e.test.ts

Test Coverage

API Tests (21 tests)

Core API (8 tests)

  • ✅ GET /api - API information
  • ✅ GET /api/version - version information
  • ✅ GET /metrics - metrics data
  • ✅ POST /compile - simple compilation
  • ✅ POST /compile - with transformations
  • ✅ POST /compile - cache behavior
  • ✅ POST /compile/batch - batch compilation
  • ✅ POST /compile - error handling

Streaming (1 test)

  • ✅ POST /compile/stream - SSE streaming

Queue (4 tests)

  • ✅ GET /queue/stats - queue statistics
  • ✅ POST /compile/async - async compilation
  • ✅ POST /compile/batch/async - async batch compilation
  • ✅ GET /queue/results/{id} - retrieve results

Performance (3 tests)

  • ✅ Response time < 2s
  • ✅ Concurrent requests (5 parallel)
  • ✅ Large batch (10 items)

Error Handling (3 tests)

  • ✅ Invalid JSON
  • ✅ Missing configuration
  • ✅ CORS headers

Additional (2 tests)

  • ✅ GET / - web UI
  • ✅ GET /api/deployments - deployment history

WebSocket Tests (9 tests)

Connection (2 tests)

  • ✅ Connection establishment
  • ✅ Receives welcome message

Compilation (2 tests)

  • ✅ Compile with streaming events
  • ✅ Multiple messages in session

Error Handling (2 tests)

  • ✅ Invalid message format
  • ✅ Invalid configuration

Lifecycle (2 tests)

  • ✅ Graceful disconnect
  • ✅ Reconnection capability

Event Streaming (1 test)

  • ✅ Receives progress events

Test Behavior

Skipped Tests

Tests are automatically skipped if:

  • Server not available - Tests will be marked as "ignored" if the server at BASE_URL is not responding
  • WebSocket not available - WebSocket tests will be skipped if the WebSocket endpoint is not accessible

You'll see warnings like:

⚠️  Server not available at http://localhost:8787
   Start the server with: deno task dev

Queue Tests

Queue-related tests accept multiple response statuses:

  • 200 - Queue is configured and operational
  • 500 - Queue not available (expected in local development)
  • 202 - Job successfully queued

This allows tests to pass in both local and production environments.

Configuration

Environment Variables

  • E2E_BASE_URL - Base URL for the server (default: http://localhost:8787)

Example:

E2E_BASE_URL=https://my-deployment.workers.dev deno task test:e2e

Timeouts

Default timeouts can be adjusted in the test files:

  • API Tests: 10 seconds per test (15s for large batches)
  • WebSocket Tests: 5-15 seconds depending on test type

Debugging

View Detailed Output

# Run with verbose output
deno test --allow-net --v8-flags=--expose-gc worker/api.e2e.test.ts

Run Single Test

# Run a specific test by name
deno test --allow-net --filter "GET /api" worker/api.e2e.test.ts

Check Server Status

# Verify server is running
curl http://localhost:8787/api

# Check WebSocket endpoint
curl -i -N -H "Connection: Upgrade" -H "Upgrade: websocket" http://localhost:8787/ws/compile

CI/CD Integration

GitHub Actions Example

name: E2E Tests

on: [push, pull_request]

jobs:
    e2e:
        runs-on: ubuntu-latest
        steps:
            - uses: actions/checkout@v4

            - uses: denoland/setup-deno@v1
              with:
                  deno-version: v2.x

            - name: Start server
              run: |
                  deno task dev &
                  sleep 10

            - name: Run E2E tests
              run: deno task test:e2e

With Wrangler

- name: Start Wrangler
  run: |
      npm install -g wrangler@3.96.0
      wrangler dev --port 8787 &
      sleep 10

- name: Run E2E tests
  run: deno task test:e2e

Writing New Tests

API Test Template

Deno.test({
    name: 'E2E: <endpoint> - <description>',
    ignore: !serverAvailable,
    fn: async () => {
        const response = await fetchWithTimeout(`${BASE_URL}/endpoint`);

        assertEquals(response.status, 200);

        const data = await response.json();
        assertExists(data.field);
    },
});

WebSocket Test Template

Deno.test({
    name: 'E2E: WebSocket - <description>',
    ignore: !wsAvailable,
    fn: async () => {
        const ws = new WebSocket(`${WS_URL}/ws/compile`);

        await new Promise<void>((resolve, reject) => {
            const timeout = setTimeout(() => {
                ws.close();
                reject(new Error('Test timeout'));
            }, 10000);

            ws.addEventListener('message', (event) => {
                // Test logic
                clearTimeout(timeout);
                ws.close();
                resolve();
            });

            ws.addEventListener('error', () => {
                clearTimeout(timeout);
                reject(new Error('WebSocket error'));
            });
        });
    },
});

Comparison with HTML E2E Tests

The project includes both:

  1. Automated E2E Tests (these files)

    • Run via command line
    • Suitable for CI/CD
    • Comprehensive test coverage
    • Automated assertions
  2. HTML E2E Dashboard (/e2e-tests.html)

    • Interactive browser-based testing
    • Visual feedback
    • Manual execution
    • Real-time monitoring

Both approaches are complementary and test the same endpoints.

Troubleshooting

"Server not available" Error

Problem: Tests skip because server is not responding

Solution:

# Verify server is running
deno task dev

# Or check if port is in use
lsof -ti :8787

"Test timeout" Errors

Problem: Tests timing out

Solution:

  • Increase timeout in test file
  • Check server logs for errors
  • Verify network connectivity
  • Check if server is under load

WebSocket Connection Failures

Problem: WebSocket tests failing

Solution:

# Check if WebSocket endpoint exists
curl -i -N \
  -H "Connection: Upgrade" \
  -H "Upgrade: websocket" \
  http://localhost:8787/ws/compile

# Verify wrangler.toml has WebSocket support

Queue Tests Failing

Problem: Queue tests returning unexpected errors

Solution:

  • Local development: 500 is expected (queues not configured)
  • Production: Verify queue bindings in wrangler.toml
  • Check Cloudflare dashboard for queue configuration

Support

For issues or questions:

  1. Check the main README
  2. Review test output for specific error messages
  3. Verify server is running and accessible
  4. Check that all dependencies are installed

Database Setup

Documentation for database architecture, setup, and backend evaluation.

Contents

Quick Start

# Start local PostgreSQL with Docker
bash quickstart.sh

Database Architecture

Visual reference for the multi-tier storage architecture introduced in Phase 1 of the PlanetScale PostgreSQL + Cloudflare Hyperdrive integration.

Table of Contents


Storage Tier Overview

The system uses four storage tiers arranged by access latency and role:

flowchart TB
    subgraph "Cloudflare Worker"
        W[Worker Request Handler]
    end

    subgraph "L0 · KV — Hot Cache (1–5 ms)"
        KV_CACHE[(COMPILATION_CACHE)]
        KV_METRICS[(METRICS)]
        KV_RATE[(RATE_LIMIT)]
    end

    subgraph "L1 · D1 — Edge SQLite (1–10 ms)"
        D1[(D1 SQLite\nstructured cache)]
    end

    subgraph "L2 · Hyperdrive → PlanetScale PostgreSQL (20–80 ms)"
        HD[Hyperdrive\nconnection pool]
        PG[(PlanetScale\nPostgreSQL\nsource of truth)]
        HD --> PG
    end

    subgraph "Blob · R2 (5–50 ms)"
        R2[(FILTER_STORAGE\ncompiled outputs\n& raw content)]
    end

    W -->|cache lookup| KV_CACHE
    W -->|structured cache| D1
    W -->|relational queries| HD
    W -->|large blobs| R2

    style KV_CACHE fill:#fff9c4,stroke:#fbc02d
    style KV_METRICS fill:#fff9c4,stroke:#fbc02d
    style KV_RATE fill:#fff9c4,stroke:#fbc02d
    style D1 fill:#c8e6c9,stroke:#388e3c
    style HD fill:#e1f5ff,stroke:#0288d1
    style PG fill:#e1f5ff,stroke:#0288d1
    style R2 fill:#f3e5f5,stroke:#7b1fa2
TierBindingTechnologyRole
L0COMPILATION_CACHE, METRICS, RATE_LIMITCloudflare KVHot-path key-value cache
L1DBCloudflare D1 (SQLite)Edge read cache for structured lookups
L2HYPERDRIVEHyperdrive → PlanetScale PostgreSQLPrimary relational store (source of truth)
BlobFILTER_STORAGECloudflare R2Large compiled outputs, raw filter content

Request Data Flow

Current behaviour (Phase 1)

The compile handler today only consults the KV cache (COMPILATION_CACHE). D1, PostgreSQL, and R2 are not in the hot compile path yet:

flowchart TD
    REQ([Incoming Request\nPOST /compile]) --> KV_CHECK{L0 KV\ncache hit?}

    KV_CHECK -->|Hit| RETURN_KV([Return cached result\n~1–5 ms])
    KV_CHECK -->|Miss| DO_COMPILE[Run in-memory\ntransformation pipeline]
    DO_COMPILE --> KV_WRITE[L0: SET compiled result\nin COMPILATION_CACHE\nTTL 60 s]
    KV_WRITE --> RESPOND([Return response])

    style RETURN_KV fill:#fff9c4,stroke:#fbc02d

Target behaviour (Phase 5 — planned)

Once the full Hyperdrive/R2 integration is complete (Phases 2–5), the flow will traverse all storage tiers:

flowchart TD
    REQ([Incoming Request]) --> AUTH{Authenticated?}
    AUTH -->|No| REJECT([401 Unauthorized])
    AUTH -->|Yes| KV_CHECK{L0 KV\ncache hit?}

    KV_CHECK -->|Hit| RETURN_KV([Return cached result\n~1–5 ms])

    KV_CHECK -->|Miss| D1_CHECK{L1 D1\ncache hit?}

    D1_CHECK -->|Hit| RETURN_D1([Return result\npopulate L0 KV\n~1–10 ms])

    D1_CHECK -->|Miss| PG_META[L2: Query PlanetScale\nfor filter metadata]
    PG_META --> R2_READ[Blob: Read compiled\noutput from R2]
    R2_READ --> COMPILE{Needs\nrecompile?}

    COMPILE -->|No| SERVE_CACHED[Serve existing\ncompiled output]
    COMPILE -->|Yes| DO_COMPILE[Run compilation\npipeline]

    DO_COMPILE --> R2_WRITE[Blob: Write new\ncompiled output to R2]
    R2_WRITE --> PG_WRITE[L2: Write metadata\n+ CompilationEvent to PG]
    PG_WRITE --> D1_WRITE[L1: Update D1\ncache entry]
    D1_WRITE --> KV_WRITE[L0: Store result\nin KV cache]
    KV_WRITE --> RESPOND([Return response])

    SERVE_CACHED --> KV_WRITE

    style RETURN_KV fill:#fff9c4,stroke:#fbc02d
    style RETURN_D1 fill:#c8e6c9,stroke:#388e3c
    style PG_META fill:#e1f5ff,stroke:#0288d1
    style PG_WRITE fill:#e1f5ff,stroke:#0288d1
    style R2_READ fill:#f3e5f5,stroke:#7b1fa2
    style R2_WRITE fill:#f3e5f5,stroke:#7b1fa2
    style REJECT fill:#ffcdd2,stroke:#d32f2f

Write Path

Current behaviour (Phase 1)

Today POST /compile writes only to the KV cache:

sequenceDiagram
    participant C as Client
    participant W as Worker
    participant KV as L0 KV (COMPILATION_CACHE)

    C->>W: POST /compile (with filter sources)

    Note over W: Run in-memory transformation pipeline<br/>and compile filter list

    W->>KV: SET compiled result (TTL 60s)
    W-->>C: 200 OK (compiled filter list)

Target behaviour (Phase 5 — planned)

Once Phase 2–5 are implemented, writes will propagate through all tiers:

sequenceDiagram
    participant C as Client
    participant W as Worker
    participant PG as L2 PostgreSQL
    participant R2 as Blob R2
    participant D1 as L1 D1
    participant KV as L0 KV

    C->>W: POST /compile (with filter sources)
    W->>PG: Read FilterSource + latest version metadata
    PG-->>W: metadata, r2_key
    W->>R2: GET compiled blob (r2_key)
    R2-->>W: compiled content

    Note over W: Run transformation pipeline if stale

    W->>R2: PUT new compiled blob → new r2_key
    W->>PG: INSERT CompiledOutput (config_hash, r2_key, rule_count)
    W->>PG: INSERT CompilationEvent (duration_ms, cache_hit)
    W->>D1: UPSERT cache entry (TTL 60–300s)
    W->>KV: SET cached result (TTL 60s)
    W-->>C: 200 OK (compiled filter list)

Authentication Flow

API key authentication as implemented in worker/middleware/auth.ts (authenticateRequest):

flowchart TD
    REQ([Request]) --> HAS_BEARER{Authorization header\nwith Bearer token?}

    HAS_BEARER -->|Yes| HAS_HD{Hyperdrive binding\navailable?}
    HAS_HD -->|No| ADMIN_HEADER
    HAS_HD -->|Yes| EXTRACT[Extract token\nfrom Authorization header]
    EXTRACT --> HASH[SHA-256 hash\nthe raw token]
    HASH --> PG_LOOKUP[L2: SELECT api_keys\nWHERE key_hash = $1]

    PG_LOOKUP --> FOUND{Key found?}
    FOUND -->|No| REJECT([401 Unauthorized])

    FOUND -->|Yes| REVOKED{revoked_at\nIS NULL?}
    REVOKED -->|No| REJECT
    REVOKED -->|Yes| EXPIRY{expires_at\nin the future\nor NULL?}
    EXPIRY -->|Expired| REJECT
    EXPIRY -->|Valid| SCOPE[Validate request\nscope vs key scopes]
    SCOPE -->|Insufficient| REJECT403([403 Forbidden])
    SCOPE -->|OK| UPDATE_USED[Fire-and-forget:\nUPDATE last_used_at]
    UPDATE_USED --> PROCEED([Proceed with request])

    HAS_BEARER -->|No| ADMIN_HEADER{X-Admin-Key\nheader present?}
    ADMIN_HEADER -->|No| REJECT
    ADMIN_HEADER -->|Yes| ADMIN_MATCH{X-Admin-Key equals\nstatic ADMIN_KEY?}
    ADMIN_MATCH -->|No| REJECT
    ADMIN_MATCH -->|Yes| ADMIN_OK([Proceed as admin])

    style REJECT fill:#ffcdd2,stroke:#d32f2f
    style REJECT403 fill:#ffcdd2,stroke:#d32f2f
    style PROCEED fill:#c8e6c9,stroke:#388e3c
    style ADMIN_OK fill:#c8e6c9,stroke:#388e3c

Header routing: Bearer token → Hyperdrive API key auth. No Bearer token (or no Hyperdrive binding) → X-Admin-Key static key fallback.


D1 → PostgreSQL Migration Flow

One-time migration from the legacy D1 SQLite store to PlanetScale PostgreSQL:

flowchart TD
    START([POST /admin/migrate/d1-to-pg]) --> DRY{?dryRun\n= true?}

    DRY -->|Yes| COUNT[Query D1 row counts\nper table]
    COUNT --> DRY_RESP([Return counts\nno writes])

    DRY -->|No| TABLES[Resolve tables to migrate\nstorage_entries, filter_cache,\ncompilation_metadata]
    TABLES --> BATCH_LOOP[For each table:\nfetch 100 rows at a time]

    BATCH_LOOP --> READ_D1[Read batch from D1]
    READ_D1 --> INSERT_PG[INSERT INTO pg\nON CONFLICT DO NOTHING]
    INSERT_PG --> MORE{More rows?}
    MORE -->|Yes| READ_D1
    MORE -->|No| NEXT_TABLE{More tables?}
    NEXT_TABLE -->|Yes| BATCH_LOOP
    NEXT_TABLE -->|No| DONE([Return migration summary\nrows migrated per table])

    style DRY_RESP fill:#fff9c4,stroke:#fbc02d
    style DONE fill:#c8e6c9,stroke:#388e3c

Idempotent: ON CONFLICT DO NOTHING means the migration can be run multiple times safely — only missing rows are inserted.


Local vs Production Connection Routing

How the worker resolves its database connection string depending on the environment:

flowchart LR
    subgraph "Production (Cloudflare Workers)"
        PROD_W[Worker] -->|env.HYPERDRIVE\n.connectionString| HD_PROD[Hyperdrive\nconnection pool]
        HD_PROD --> PS[(PlanetScale\nPostgreSQL)]
    end

    subgraph "Local Dev (wrangler dev)"
        LOCAL_W[Worker] -->|WRANGLER_HYPERDRIVE_LOCAL_CONNECTION_STRING_HYPERDRIVE\nfrom .env.local| LOCAL_PG[(Local PostgreSQL\nDocker / native)]
    end

    subgraph "Prisma CLI (migrations)"
        PRISMA[npx prisma migrate] -->|DIRECT_DATABASE_URL\nor DATABASE_URL\nfrom .env.local| LOCAL_PG
    end

    style HD_PROD fill:#e1f5ff,stroke:#0288d1
    style PS fill:#e1f5ff,stroke:#0288d1
    style LOCAL_PG fill:#c8e6c9,stroke:#388e3c

Set credentials in .env.local (gitignored). See .env.example and local-dev.md.


Schema Relationships

Core PostgreSQL model relationships derived from prisma/schema.prisma. Field names reflect the underlying database column names (snake_case); Prisma model field names are the camelCase equivalents (e.g., display_namedisplayName).

erDiagram
    User {
        uuid id PK
        string email
        string display_name
        string role
        timestamp created_at
        timestamp updated_at
    }

    ApiKey {
        uuid id PK
        uuid user_id FK
        string key_hash
        string key_prefix
        string name
        string[] scopes
        int rate_limit_per_minute
        timestamp last_used_at
        timestamp expires_at
        timestamp revoked_at
        timestamp created_at
        timestamp updated_at
    }

    Session {
        uuid id PK
        uuid user_id FK
        string token_hash
        string ip_address
        string user_agent
        timestamp expires_at
        timestamp created_at
    }

    FilterSource {
        uuid id PK
        string url
        string name
        string description
        boolean is_public
        string owner_user_id
        int refresh_interval_seconds
        int consecutive_failures
        string status
        timestamp last_checked_at
        timestamp created_at
        timestamp updated_at
    }

    FilterListVersion {
        uuid id PK
        uuid source_id FK
        string content_hash
        int rule_count
        string etag
        string r2_key
        boolean is_current
        timestamp fetched_at
        timestamp expires_at
    }

    CompiledOutput {
        uuid id PK
        string config_hash
        string config_name
        json config_snapshot
        int rule_count
        int source_count
        int duration_ms
        string r2_key
        string owner_user_id
        timestamp created_at
        timestamp expires_at
    }

    CompilationEvent {
        uuid id PK
        uuid compiled_output_id FK
        string user_id
        string api_key_id
        string request_source
        string worker_region
        int duration_ms
        boolean cache_hit
        string error_message
        timestamp created_at
    }

    SourceHealthSnapshot {
        uuid id PK
        uuid source_id FK
        string status
        int total_attempts
        int successful_attempts
        int failed_attempts
        int consecutive_failures
        float avg_duration_ms
        float avg_rule_count
        timestamp recorded_at
    }

    SourceChangeEvent {
        uuid id PK
        uuid source_id FK
        string previous_version_id
        string new_version_id
        int rule_count_delta
        boolean content_hash_changed
        timestamp detected_at
    }

    User ||--o{ ApiKey : "owns"
    User ||--o{ Session : "has"
    FilterSource ||--o{ FilterListVersion : "has versions"
    FilterSource ||--o{ SourceHealthSnapshot : "monitored by"
    FilterSource ||--o{ SourceChangeEvent : "changes tracked by"
    CompiledOutput ||--o{ CompilationEvent : "recorded in"

References

Database Evaluation: PlanetScale vs Neon vs Cloudflare vs Prisma

Goal: Evaluate PostgreSQL-compatible database vendors and design a relational schema to replace/complement the current Cloudflare R2 + D1 storage system.


Table of Contents

  1. Current State
  2. What a Better Backend Could Unlock
  3. Vendor Evaluation
  4. Head-to-Head Comparison
  5. Proposed Database Design
  6. Recommended Architecture
  7. Cloudflare Hyperdrive Integration
  8. Migration Plan
  9. Proposed PostgreSQL Schema
  10. References

Current State

The adblock-compiler uses three distinct storage mechanisms:

StorageTechnologyPurposeLocation
Cloudflare D1SQLite at edgeFilter cache, compilation metadata, health metricsEdge (Workers)
Cloudflare R2Object storage (S3-compatible)Large filter list blobs, output artifactsEdge (object store)
Prisma/SQLiteSQLite via Prisma ORMLocal dev storage, same schema as D1Local / Node.js / Deno

Hyperdrive is already configured in wrangler.toml with a binding (HYPERDRIVE) but no target database yet:

[[hyperdrive]]
binding = "HYPERDRIVE"
id = "126a652809674e4abc722e9777ee4140"
localConnectionString = "postgres://username:password@127.0.0.1:5432/database"

Current Limitations

LimitationImpact
D1 is SQLite — no real concurrent writesCannot scale beyond a single Worker's D1 replica
D1 max row size: 1 MBLarge filter lists cannot be stored as single rows
R2 has no query capabilityCannot filter, sort, or aggregate stored lists
No authentication systemNo per-user API keys, rate limiting per account, or admin roles
No shared state between deploymentsEach Worker region may see different data
No schema validation at the DB levelBusiness rules enforced only in TypeScript code
SQLite lacks advanced indexingFull-text search, JSONB queries, pg_vector extensions not available

What a Better Backend Could Unlock

Moving to a shared relational PostgreSQL database (e.g., via Neon + Hyperdrive) would enable:

  1. User authentication — API keys, JWT sessions, OAuth. Users could save filter list configurations, track compilation history, and have per-account rate limits.
  2. Shared blocklist registry — Store popular/community filter lists in the database. Workers query and serve them without downloading from upstream every time.
  3. Real-time analytics — Aggregate compile counts, rule counts, latency distributions across all Workers using proper SQL aggregations.
  4. Full-text search — Search through filter rules, source URLs, or configuration names using PostgreSQL tsvector.
  5. Admin dashboard backend — Persist admin-managed settings, feature flags, and overrides across regions.
  6. Row-level security — Tenant isolation for a future multi-tenant SaaS offering.
  7. Branching / staging environments — Neon's branch-per-environment feature maps perfectly to the existing development, staging, and production Cloudflare environments.

Vendor Evaluation

Cloudflare D1 (current edge database)

D1 is Cloudflare's managed SQLite service that runs at the edge. It replicates reads globally while writes go to a primary location.

Pros

  • ✅ Zero additional infrastructure — runs natively inside Cloudflare Workers
  • ✅ No connection overhead — native binding (env.DB)
  • ✅ Global read replication (SQLite replicated to ~300 PoPs)
  • ✅ Free tier: 5 million rows read/day, 100k writes/day, 5 GB storage
  • ✅ Familiar SQL syntax
  • ✅ Prisma D1 adapter available (@prisma/adapter-d1)
  • ✅ Already in use — schema exists, migrations applied

Cons

  • ❌ SQLite — no real PostgreSQL features (JSONB, arrays, extensions, pg_vector)
  • ❌ 1 MB max row size — large filter lists require chunking
  • ❌ Write-path latency — writes go to a single primary (up to 70–100 ms from edge)
  • ❌ 10 GB max database size per database
  • ❌ No concurrent write transactions (single-writer model)
  • ❌ No authentication at DB level (no row-level security, no roles)
  • ❌ Limited aggregation / window functions compared to PostgreSQL

Best for: Edge-local caching, ephemeral session state, hot-path lookups where read latency matters most.


Cloudflare R2 (current object storage)

R2 is Cloudflare's S3-compatible object storage with no egress fees.

Pros

  • ✅ No egress fees (unlike AWS S3)
  • ✅ S3-compatible API
  • ✅ Excellent for large binary blobs (full compiled filter lists, backups)
  • ✅ Already used for FILTER_STORAGE binding
  • ✅ Free tier: 10 GB storage, 1M Class-A operations/month

Cons

  • ❌ Object store only — no SQL, no query capability
  • ❌ Cannot query contents — must know the exact key
  • ❌ Not suitable as a primary relational database
  • ❌ Metadata is limited (only HTTP headers / custom metadata per object)

Best for: Storing compiled filter list artifacts (.txt blobs), backup snapshots. Keep R2 even after migrating to PostgreSQL.


Cloudflare Hyperdrive

Hyperdrive is not a database — it is a connection accelerator and query result caching layer that sits between Cloudflare Workers and any external PostgreSQL (or MySQL) database.

Cloudflare Worker
    ↓  (standard pg connection string)
Hyperdrive
    ↓  (pooled, geographically distributed)
PostgreSQL database (Neon / Supabase / self-hosted)

How it helps

  • Connection pooling — PostgreSQL allows ~100–500 max connections; Workers can fan out to thousands. Hyperdrive maintains a connection pool close to your database and reuses connections across requests.
  • Query caching — Non-mutating queries (SELECT) can be cached at the Hyperdrive edge PoP for configurable TTLs, reducing round-trip to the origin database.
  • Lower latency — Without Hyperdrive, a Worker in Europe connecting to a US-east PostgreSQL incurs ~120 ms TCP handshake + TLS. With Hyperdrive, the TLS session is pre-warmed and pooled.

Pros

  • ✅ Works with any standard PostgreSQL wire protocol
  • ✅ Reduces cold-start latency by 2–10×
  • ✅ Transparent to the application — use standard pg client
  • ✅ Already configured in wrangler.toml (binding HYPERDRIVE)
  • ✅ Caches SELECT results at the edge
  • ✅ Pay-per-use, included in Workers Paid plan

Cons

  • ❌ Requires an external PostgreSQL database (it accelerates but does not replace one)
  • ❌ Not available on free Workers plan
  • ❌ Some client libraries need minor adaptation (pg node-postgres works; Prisma requires @prisma/adapter-pg)

Best for: Accelerating connections from Workers to any external PostgreSQL provider (Neon, Supabase, etc.).


Neon — Serverless PostgreSQL

Neon is a serverless PostgreSQL service built on a disaggregated storage architecture. Compute auto-scales to zero when idle.

Pros

  • True PostgreSQL — full compatibility including extensions (pg_vector, pg_trgm, uuid-ossp, PostGIS, etc.)
  • Serverless / auto-suspend — compute pauses when idle, reducing cost during low-traffic periods
  • Branching — create a database branch per feature branch, PR environment, or staging slot (same as git branches)
  • Cloudflare Hyperdrive compatible — standard PostgreSQL wire protocol
  • @neondatabase/serverless WebSocket driver — works directly in Cloudflare Workers without Hyperdrive (useful as a fallback)
  • Prisma support@prisma/adapter-neon available
  • Generous free tier — 512 MB storage, 1 compute unit, unlimited branches
  • Point-in-time restore — up to 30 days (paid plans)
  • Row-level security — PostgreSQL native RLS via roles/policies

Cons

  • ❌ Cold start latency (~100–500 ms on free tier when compute was suspended) — mitigated by Hyperdrive caching
  • ❌ WebSocket driver has some quirks vs. standard pg module
  • ❌ Compute scaling has a ceiling on lower-tier plans
  • ❌ Relatively newer product (launched 2022) compared to established providers

Pricing (2025)

TierStorageComputeCost
Free512 MB0.25 CU, auto-suspend$0/month
Launch10 GB1 CU, auto-suspend$19/month
Scale50 GB4 CU, auto-suspend$69/month

Best for: Projects needing true PostgreSQL on a serverless, low-ops budget. The branching feature maps directly to Cloudflare's multi-environment deployment model.


PlanetScale — Native PostgreSQL

⚠️ Important: PlanetScale launched native PostgreSQL support in 2025 (GA). The original evaluation described PlanetScale as MySQL/Vitess — that is no longer accurate. This section reflects the current PostgreSQL product.

PlanetScale is a managed, horizontally-scalable database platform that now offers native PostgreSQL (versions 17 and 18) in addition to its existing MySQL/Vitess offering. The PostgreSQL product is built on a new architecture ("Neki") purpose-built for PostgreSQL — not a port of Vitess. PlanetScale has an official partnership with Cloudflare, with a co-authored blog post and dedicated integration guides for Hyperdrive + Workers.

Pros

  • True native PostgreSQL (v17 & v18) — not an emulation layer; standard PostgreSQL wire protocol
  • Full PostgreSQL feature set — foreign keys enforced at DB level, JSONB, arrays, window functions, CTEs, stored procedures, triggers, materialized views, full-text search, partitioning
  • PostgreSQL extensions — supports commonly used extensions (uuid-ossp, pg_trgm, etc.)
  • Row-level security — PostgreSQL native RLS via roles and policies
  • Branching — git-style database branching; safe schema migrations via deploy requests (same model as Neon)
  • Zero-downtime schema migrations — online schema changes without table locks
  • Official Cloudflare Workers integration — Cloudflare partnership announcement; dedicated tutorial for PlanetScale Postgres + Hyperdrive + Workers; listed on Cloudflare Workers third-party integrations page
  • Hyperdrive compatible — standard PostgreSQL wire protocol; works directly with the existing HYPERDRIVE binding
  • Standard Prisma support — works with standard @prisma/adapter-pg or @prisma/adapter-neon; no workarounds needed
  • Standard drivers — libpq, node-postgres (pg), psycopg, Deno postgres — all work without modification
  • Import from existing PostgreSQL — supports live import from PostgreSQL v13+
  • High performance — NVMe SSD storage, primary + replica clusters across AZs, automatic failover
  • High write throughput — "Neki" architecture designed for horizontal PostgreSQL scaling

Cons

  • No free tier — PostgreSQL plans start at ~$39/month; no permanent free tier (Neon offers 512 MB free)
  • Newer PostgreSQL product — GA since mid-2025; Neon has a longer track record as a serverless PostgreSQL provider
  • No auto-suspend — unlike Neon, PlanetScale Postgres clusters do not auto-pause when idle; charges accrue even at zero traffic
  • "Neki" sharding still rolling out — horizontal sharding features are in progress; single-node/HA clusters available now
  • Higher cost for small projects — the entry pricing is significantly higher than Neon for low-traffic or development use

Pricing (2025)

TierDescriptionCost
Metal (HA)Primary + 2 replicas, NVMe SSD, 10 GB+ storage~$39–$50/month
Single-nodeNon-HA development option (availability varies)Lower, varies

Best for: Production applications requiring high-availability, high write throughput, zero-downtime migrations, and horizontal scalability, with a preference for Cloudflare's official PlanetScale integration. For projects with a free/low-cost tier requirement, Neon is still preferred.


Prisma ORM

Prisma is an ORM (Object-Relational Mapper) that generates type-safe database clients from a schema file. Prisma is not a database — it works on top of the databases evaluated above.

Pros

  • Already in usePrismaStorageAdapter and D1StorageAdapter both exist
  • Type-safe queries — generated TypeScript client from schema.prisma
  • Multi-database support — same code, different provider (SQLite → PostgreSQL requires only a config change)
  • Migration managementprisma migrate dev generates and applies SQL migrations
  • Prisma Studio — GUI data browser
  • Driver adapters@prisma/adapter-neon, @prisma/adapter-d1, @prisma/adapter-pg for edge runtimes
  • Deno support — via runtime = "deno" in generator config
  • Works with all vendors — PostgreSQL (Neon, PlanetScale, Supabase), SQLite (D1, local)

Cons

  • Prisma Client in Cloudflare Workers — requires driver adapter (@prisma/adapter-neon or @prisma/adapter-pg via Hyperdrive)
  • Bundle size — Prisma Client adds ~300 KB to Worker bundle; use edge-compatible driver adapters
  • Raw SQL sometimes needed — complex PostgreSQL queries (e.g., UPSERT ... RETURNING, CTEs) require prisma.$queryRaw
  • MongoDB has limitations — some Prisma features not supported on MongoDB connector

Recommendation: Keep Prisma as the ORM layer. Use @prisma/adapter-neon or @prisma/adapter-pg (via Hyperdrive) in Workers.


Head-to-Head Comparison

CriterionCloudflare D1Cloudflare R2NeonPlanetScalePrisma
Database typeSQLiteObject storePostgreSQLPostgreSQLORM (any DB)
True PostgreSQL✅ (v17/v18)via adapter
Foreign keysN/A
JSONB columns
ExtensionsN/A✅ (pg_vector, etc.)✅ (pg_trgm, uuid-ossp, etc.)
Row-level securityvia DB
BranchingN/A
Serverless / auto-scale✅ (auto-suspend)✅ (HA clusters)N/A
Auto-suspend (zero-cost idle)N/A
Works in CF Workers✅ (native)✅ (native)✅ (ws driver or Hyperdrive)✅ (Hyperdrive / pg driver)✅ (adapter)
Official CF integration✅ (native)✅ (native)via Hyperdrive✅ (official partnership)N/A
Hyperdrive compatibleN/A
Free tier✅ (generous)✅ (generous)✅ (512 MB)❌ (~$39/mo min)N/A
Max storage10 GB/DBUnlimitedPlan-dependentPlan-dependentN/A
Connection poolingBuilt-inN/ANeon pooler / HyperdriveBuilt-in / HyperdriveN/A
Migration toolingManual SQL / PrismaN/APrisma / raw SQLPrisma / deploy requestsBuilt-in CLI
Latency (from Worker)~0–5 ms (edge)~5–50 ms~20–120 ms + Hyperdrive~20–100 ms + HyperdriveN/A
Best useHot-path edge KVBlob storageServerless primary DB (free tier)High-perf primary DB (production)ORM layer

Proposed Database Design

The following schema design uses PostgreSQL conventions and targets Neon as the primary provider, accessed from Workers via Hyperdrive + Prisma.

Authentication System

An authentication system enables per-user API keys, admin roles, and audit logging.

users
├── id (UUID)
├── email (unique)
├── display_name
├── role (admin | user | readonly)
├── created_at
└── updated_at

api_keys
├── id (UUID)
├── user_id → users.id
├── key_hash (SHA-256 of the raw key — never store plaintext)
├── key_prefix (first 8 chars for display, e.g. "abc12345...")
├── name (human label, e.g. "CI pipeline key")
├── scopes (text[] — e.g. ['compile', 'admin:read'])
├── rate_limit_per_minute
├── last_used_at
├── expires_at (nullable)
├── revoked_at (nullable)
├── created_at
└── updated_at

sessions (for web UI login)
├── id (UUID)
├── user_id → users.id
├── token_hash
├── ip_address
├── user_agent
├── expires_at
└── created_at

Design decisions:

  • Store only the hash of API keys — never plaintext. On creation, return the raw key once to the user.
  • Use PostgreSQL text[] for scopes — avoids a join table for simple RBAC.
  • sessions is for browser sessions (cookie-based); api_keys is for programmatic access.
  • Leverage PostgreSQL row-level security to ensure users can only see their own data.

Blocklist Storage and Caching

Rather than only caching in R2 or D1, persist structured metadata in PostgreSQL with blobs in R2.

filter_sources
├── id (UUID)
├── url (unique) — canonical upstream URL
├── name — human label (e.g. "EasyList")
├── description
├── homepage
├── license
├── is_public (bool) — community-visible or private
├── owner_user_id → users.id (nullable — NULL = system/community)
├── refresh_interval_seconds (e.g. 3600)
├── last_checked_at
├── last_success_at
├── last_failure_at
├── consecutive_failures
├── status (healthy | degraded | unhealthy | unknown)
├── created_at
└── updated_at

filter_list_versions
├── id (UUID)
├── source_id → filter_sources.id
├── content_hash (SHA-256)
├── rule_count
├── etag
├── r2_key — pointer to R2 object containing raw content
├── fetched_at
├── expires_at
└── is_current (bool — latest successful fetch)

compiled_outputs
├── id (UUID)
├── config_hash (SHA-256 of the input IConfiguration JSON)
├── config_name
├── config_snapshot (jsonb — full IConfiguration used)
├── rule_count
├── source_count
├── duration_ms
├── r2_key — pointer to R2 object containing compiled output
├── owner_user_id → users.id (nullable)
├── created_at
└── expires_at (nullable — NULL = permanent)

Design decisions:

  • Raw filter list content lives in R2 (blobs up to gigabytes). PostgreSQL stores metadata and the R2 object key.
  • filter_list_versions tracks every fetch, enabling point-in-time recovery and diffing.
  • compiled_outputs stores the result of each unique compilation (deduplication by config_hash).
  • config_snapshot as jsonb enables querying past configurations.

Compilation History and Metrics

compilation_events
├── id (UUID)
├── compiled_output_id → compiled_outputs.id
├── user_id → users.id (nullable)
├── api_key_id → api_keys.id (nullable)
├── request_source (worker | cli | batch_api)
├── worker_region (e.g. "enam", "weur")
├── client_ip_hash
├── duration_ms
├── cache_hit (bool)
├── error_message (nullable)
└── created_at

-- Materialized view for dashboard analytics
-- CREATE MATERIALIZED VIEW compilation_stats_hourly AS
-- SELECT
--   date_trunc('hour', created_at) AS hour,
--   count(*) AS total,
--   sum(CASE WHEN cache_hit THEN 1 ELSE 0 END) AS cache_hits,
--   avg(duration_ms) AS avg_duration_ms,
--   max(rule_count) AS max_rules
-- FROM compilation_events
-- JOIN compiled_outputs ON ...
-- GROUP BY 1;

Source Health and Change Tracking

source_health_snapshots
├── id (UUID)
├── source_id → filter_sources.id
├── status (healthy | degraded | unhealthy)
├── total_attempts
├── successful_attempts
├── failed_attempts
├── consecutive_failures
├── avg_duration_ms
├── avg_rule_count
└── recorded_at

source_change_events
├── id (UUID)
├── source_id → filter_sources.id
├── previous_version_id → filter_list_versions.id (nullable)
├── new_version_id → filter_list_versions.id
├── rule_count_delta (new - previous)
├── content_hash_changed (bool)
└── detected_at

Summary Recommendation

Use Neon (PostgreSQL) + Cloudflare Hyperdrive + Prisma ORM as the default path, while keeping D1 for hot-path edge caching and R2 for blob storage. PlanetScale PostgreSQL is a strong production alternative with an official Cloudflare partnership — preferred if higher write throughput or HA from day one is required.

Both Neon and PlanetScale now offer native PostgreSQL with Hyperdrive compatibility. The choice between them is primarily cost vs. performance:

Decision factorChoose NeonChoose PlanetScale
Starting costFree tier available (512 MB)~$39/month minimum
Zero idle cost✅ Auto-suspend❌ Charges even at idle
Official CF partnershipVia Hyperdrive docs✅ Official blog + dedicated tutorial
Established track record✅ Mature serverless PostgreSQLPostgreSQL product GA mid-2025
Production HASingle-region primaryMulti-AZ primary + replicas
Write throughputServerlessHigh-performance NVMe
ConcernTechnologyRationale
Primary relational DBNeon (default) or PlanetScaleNeon: free tier, auto-suspend, mature serverless PostgreSQL; PlanetScale: official CF partnership, higher perf, HA from day one
Edge accelerationCloudflare HyperdriveReduces Worker → Neon latency by 2–10×, connection pooling
ORMPrismaAlready integrated, type-safe, Deno + Workers compatible via adapters
Edge hot-path cacheCloudflare D1Sub-5ms lookups for filter cache hits; keep as L1 cache layer
Blob storageCloudflare R2Large compiled outputs, raw filter list content
Local development DBSQLite via PrismaZero-config local dev; switch to PostgreSQL URL for staging/prod

Architecture Diagram

┌─────────────────────────────────────────────────────────────────┐
│                    Cloudflare Worker                            │
│                                                                 │
│  Request                                                        │
│    ↓                                                            │
│  [D1 cache lookup]  ──── HIT ────▶  Return cached result       │
│    ↓ MISS                                                       │
│  [Hyperdrive]  ──────────────────▶  [Neon PostgreSQL]          │
│    ↓                                        ↓                  │
│  [Prisma Client]  ◀──────────────  Query result                │
│    ↓                                                            │
│  [R2]  (fetch blob if needed)                                   │
│    ↓                                                            │
│  [D1 cache write]  (populate L1 cache)                         │
│    ↓                                                            │
│  Return response                                                │
└─────────────────────────────────────────────────────────────────┘

Data Flow by Use Case

OperationL1 (D1)L2 (Hyperdrive → Neon)Blob (R2)
Compile filter list (cache hit)Read
Compile filter list (cache miss)Write (on complete)Read/Write metadataRead blob
Store compiled outputWrite metadataWrite blob
User authenticationRead api_keys
Health monitoringRead/WriteWrite snapshots
Admin dashboardRead aggregates
Analytics queriesRead materialized views

Cloudflare Hyperdrive Integration

Hyperdrive is already configured in wrangler.toml. The steps below show both Neon and PlanetScale options — choose whichever vendor you select.

1. Create Your PostgreSQL Database

Option A — Neon (free tier, auto-suspend)

# Install Neon CLI
npm install -g neonctl

# Create a project
neonctl projects create --name adblock-compiler

# Get connection string
neonctl connection-string --project-id <PROJECT_ID>
# Output: postgres://user:password@ep-xxx.us-east-2.aws.neon.tech/neondb?sslmode=require

Option B — PlanetScale (official Cloudflare partnership)

Create a PostgreSQL database from the PlanetScale dashboard, then copy the connection string from the "Connect" panel (select "Postgres" and "node-postgres").

postgres://user:password@aws.connect.psdb.cloud/adblock?sslmode=require

PlanetScale has a dedicated Cloudflare Workers integration tutorial at: https://planetscale.com/docs/postgres/tutorials/planetscale-postgres-cloudflare-workers

2. Update Hyperdrive with Your Database Connection

# Create Hyperdrive config — works for both Neon and PlanetScale (standard PostgreSQL protocol)
wrangler hyperdrive create adblock-hyperdrive \
  --connection-string="postgres://user:password@<HOST>/<DATABASE>?sslmode=require"

# Note the returned ID and update wrangler.toml

Update wrangler.toml:

[[hyperdrive]]
binding = "HYPERDRIVE"
id = "<NEW_HYPERDRIVE_ID>"
localConnectionString = "postgres://username:password@127.0.0.1:5432/adblock_dev"

3. Install Prisma with PostgreSQL Adapter

Both Neon and PlanetScale use standard PostgreSQL wire protocol, so either adapter works with Hyperdrive:

# For Neon (uses @neondatabase/serverless WebSocket driver)
npm install @prisma/client @prisma/adapter-neon @neondatabase/serverless
npm install -D prisma

# For PlanetScale Postgres or any standard PostgreSQL via Hyperdrive (uses node-postgres)
npm install @prisma/client @prisma/adapter-pg pg
npm install -D prisma

4. Update Prisma Schema for PostgreSQL

Update prisma/schema.prisma to switch the provider:

generator client {
  provider        = "prisma-client-js"
  previewFeatures = ["driverAdapters"]
}

datasource db {
  provider = "postgresql"
  url      = env("DATABASE_URL")
  // For local dev: DATABASE_URL="postgres://user:pass@localhost:5432/adblock"
  // For production: set via wrangler secret put DATABASE_URL
}

5. Use Hyperdrive in the Worker

// worker/worker.ts — Option A: Neon adapter (WebSocket driver)
import { PrismaClient } from '@prisma/client';
import { PrismaNeon } from '@prisma/adapter-neon';
import { neon } from '@neondatabase/serverless';

export interface Env {
    HYPERDRIVE: Hyperdrive;
    DB: D1Database;           // keep for edge caching
    FILTER_STORAGE: R2Bucket; // keep for blob storage
}

function createPrisma(env: Env): PrismaClient {
    // Use Hyperdrive connection string — it handles pooling + caching
    const sql = neon(env.HYPERDRIVE.connectionString);
    const adapter = new PrismaNeon(sql);
    return new PrismaClient({ adapter });
}

export default {
    async fetch(request: Request, env: Env): Promise<Response> {
        const prisma = createPrisma(env);
        // ... use prisma for relational queries
        // ... use env.DB for fast edge caching
        // ... use env.FILTER_STORAGE for blob reads
    },
};
// worker/worker.ts — Option B: node-postgres adapter (PlanetScale or any PostgreSQL via Hyperdrive)
import { PrismaClient } from '@prisma/client';
import { PrismaPg } from '@prisma/adapter-pg';
import { Pool } from 'pg';

function createPrisma(env: Env): PrismaClient {
    const pool = new Pool({ connectionString: env.HYPERDRIVE.connectionString });
    const adapter = new PrismaPg(pool);
    return new PrismaClient({ adapter });
}

6. Configure Hyperdrive Caching

In the Cloudflare dashboard or via API, configure Hyperdrive to cache appropriate queries:

# Enable caching on the Hyperdrive config
wrangler hyperdrive update <HYPERDRIVE_ID> \
  --caching-disabled=false \
  --max-age=60 \  # Cache SELECT results for 60 seconds
  --stale-while-revalidate=15

What to cache vs. skip:

Query typeCache?Reason
SELECT filter list metadata✅ Yes (60s TTL)Rarely changes
SELECT compiled output by hash✅ Yes (300s TTL)Immutable by hash
SELECT user/api_key lookup✅ Yes (30s TTL)Low churn
INSERT/UPDATE compilation events❌ NoWrites bypass cache
SELECT health snapshots✅ Yes (30s TTL)Dashboard data

Migration Plan

Phase 1 — Set Up Infrastructure (Week 1)

  • Select primary vendor: Neon (free tier / serverless) or PlanetScale (official CF partnership / HA)
  • Create database project and production branch
  • Configure development and production branches
  • Update Hyperdrive config with connection string: wrangler hyperdrive update <ID> --connection-string="..."
  • Set DATABASE_URL secret in Cloudflare: wrangler secret put DATABASE_URL
  • Update wrangler.toml with the correct Hyperdrive ID

Phase 2 — PostgreSQL Schema (Week 1–2)

  • Update prisma/schema.prisma provider to postgresql
  • Add new models: users, api_keys, sessions, filter_sources, filter_list_versions, compiled_outputs, compilation_events
  • Run npx prisma migrate dev --name init_postgresql
  • Apply migration to Neon dev branch: npx prisma migrate deploy
  • Update .env.development with Neon dev branch connection string

Phase 3 — Update Storage Adapters (Week 2–3)

  • Create src/storage/NeonStorageAdapter.ts implementing IStorageAdapter via Prisma + Neon adapter
  • Update PrismaStorageAdapter to support both SQLite (local dev) and PostgreSQL (staging/prod) via environment variable
  • Update Worker entry point to use createPrisma(env) with Hyperdrive connection string
  • Add StorageAdapterType = 'neon' alongside existing 'prisma' | 'd1' | 'memory'

Phase 4 — Authentication (Week 3–4)

  • Implement src/services/AuthService.ts — API key creation, validation, hashing (SHA-256)
  • Add middleware to Worker router: validateApiKey(request, env)
  • Expose POST /api/auth/keys — create API key (returns raw key once)
  • Expose DELETE /api/auth/keys/:id — revoke API key
  • Wire user_id into compilation event tracking

Phase 5 — Data Migration (Week 4–5)

  • Export existing D1 data to JSON using wrangler d1 export
  • Write migration script to import into Neon PostgreSQL
  • Validate data integrity after import
  • Run both backends in parallel for one week (D1 as L1 cache, Neon as source of truth)

Phase 6 — Cutover (Week 5–6)

  • Switch primary storage reads/writes to Neon
  • Keep D1 as L1 hot cache (TTL: 60–300 seconds)
  • Keep R2 for blob storage
  • Monitor latency via Cloudflare Analytics + Neon metrics dashboard
  • Remove D1 as primary storage after 1-week validation period

Proposed PostgreSQL Schema

Below is a consolidated SQL schema (compatible with Neon PostgreSQL) combining all proposed tables. Use with prisma migrate or apply directly.

-- Enable UUID generation
CREATE EXTENSION IF NOT EXISTS "uuid-ossp";

-- ============================================================
-- Authentication
-- ============================================================

CREATE TABLE users (
    id          UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
    email       TEXT UNIQUE NOT NULL,
    display_name TEXT,
    role        TEXT NOT NULL DEFAULT 'user' CHECK (role IN ('admin', 'user', 'readonly')),
    created_at  TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    updated_at  TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

CREATE TABLE api_keys (
    id                   UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
    user_id              UUID NOT NULL REFERENCES users(id) ON DELETE CASCADE,
    key_hash             TEXT UNIQUE NOT NULL,
    key_prefix           TEXT NOT NULL,
    name                 TEXT NOT NULL,
    scopes               TEXT[] NOT NULL DEFAULT '{"compile"}',
    rate_limit_per_minute INT NOT NULL DEFAULT 60,
    last_used_at         TIMESTAMPTZ,
    expires_at           TIMESTAMPTZ,
    revoked_at           TIMESTAMPTZ,
    created_at           TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    updated_at           TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

CREATE INDEX idx_api_keys_user_id ON api_keys(user_id);
CREATE INDEX idx_api_keys_key_hash ON api_keys(key_hash);

CREATE TABLE sessions (
    id          UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
    user_id     UUID NOT NULL REFERENCES users(id) ON DELETE CASCADE,
    token_hash  TEXT UNIQUE NOT NULL,
    ip_address  TEXT,
    user_agent  TEXT,
    expires_at  TIMESTAMPTZ NOT NULL,
    created_at  TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

CREATE INDEX idx_sessions_token_hash ON sessions(token_hash);
CREATE INDEX idx_sessions_user_id    ON sessions(user_id);

-- ============================================================
-- Filter Sources
-- ============================================================

CREATE TABLE filter_sources (
    id                      UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
    url                     TEXT UNIQUE NOT NULL,
    name                    TEXT NOT NULL,
    description             TEXT,
    homepage                TEXT,
    license                 TEXT,
    is_public               BOOLEAN NOT NULL DEFAULT TRUE,
    owner_user_id           UUID REFERENCES users(id) ON DELETE SET NULL,
    refresh_interval_seconds INT NOT NULL DEFAULT 3600,
    last_checked_at         TIMESTAMPTZ,
    last_success_at         TIMESTAMPTZ,
    last_failure_at         TIMESTAMPTZ,
    consecutive_failures    INT NOT NULL DEFAULT 0,
    status                  TEXT NOT NULL DEFAULT 'unknown'
                                CHECK (status IN ('healthy', 'degraded', 'unhealthy', 'unknown')),
    created_at              TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    updated_at              TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

CREATE INDEX idx_filter_sources_status ON filter_sources(status);
CREATE INDEX idx_filter_sources_url    ON filter_sources(url);

CREATE TABLE filter_list_versions (
    id           UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
    source_id    UUID NOT NULL REFERENCES filter_sources(id) ON DELETE CASCADE,
    content_hash TEXT NOT NULL,
    rule_count   INT NOT NULL,
    etag         TEXT,
    r2_key       TEXT NOT NULL,
    fetched_at   TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    expires_at   TIMESTAMPTZ,
    is_current   BOOLEAN NOT NULL DEFAULT FALSE
);

CREATE UNIQUE INDEX idx_filter_list_versions_current
    ON filter_list_versions(source_id) WHERE is_current = TRUE;
CREATE INDEX idx_filter_list_versions_source ON filter_list_versions(source_id);
CREATE INDEX idx_filter_list_versions_hash   ON filter_list_versions(content_hash);

-- ============================================================
-- Compiled Outputs
-- ============================================================

CREATE TABLE compiled_outputs (
    id              UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
    config_hash     TEXT UNIQUE NOT NULL,
    config_name     TEXT NOT NULL,
    config_snapshot JSONB NOT NULL,
    rule_count      INT NOT NULL,
    source_count    INT NOT NULL,
    duration_ms     INT NOT NULL,
    r2_key          TEXT NOT NULL,
    owner_user_id   UUID REFERENCES users(id) ON DELETE SET NULL,
    created_at      TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    expires_at      TIMESTAMPTZ
);

CREATE INDEX idx_compiled_outputs_config_name ON compiled_outputs(config_name);
CREATE INDEX idx_compiled_outputs_created_at  ON compiled_outputs(created_at DESC);
CREATE INDEX idx_compiled_outputs_owner       ON compiled_outputs(owner_user_id);

-- ============================================================
-- Compilation Events (append-only telemetry)
-- ============================================================

CREATE TABLE compilation_events (
    id                  UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
    compiled_output_id  UUID REFERENCES compiled_outputs(id) ON DELETE SET NULL,
    user_id             UUID REFERENCES users(id) ON DELETE SET NULL,
    api_key_id          UUID REFERENCES api_keys(id) ON DELETE SET NULL,
    request_source      TEXT NOT NULL CHECK (request_source IN ('worker', 'cli', 'batch_api', 'workflow')),
    worker_region       TEXT,
    duration_ms         INT NOT NULL,
    cache_hit           BOOLEAN NOT NULL DEFAULT FALSE,
    error_message       TEXT,
    created_at          TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

CREATE INDEX idx_compilation_events_created_at ON compilation_events(created_at DESC);
CREATE INDEX idx_compilation_events_user_id    ON compilation_events(user_id);

-- ============================================================
-- Source Health Tracking
-- ============================================================

CREATE TABLE source_health_snapshots (
    id                   UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
    source_id            UUID NOT NULL REFERENCES filter_sources(id) ON DELETE CASCADE,
    status               TEXT NOT NULL CHECK (status IN ('healthy', 'degraded', 'unhealthy')),
    total_attempts       INT NOT NULL DEFAULT 0,
    successful_attempts  INT NOT NULL DEFAULT 0,
    failed_attempts      INT NOT NULL DEFAULT 0,
    consecutive_failures INT NOT NULL DEFAULT 0,
    avg_duration_ms      FLOAT NOT NULL DEFAULT 0,
    avg_rule_count       FLOAT NOT NULL DEFAULT 0,
    recorded_at          TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

CREATE INDEX idx_source_health_source_id   ON source_health_snapshots(source_id);
CREATE INDEX idx_source_health_recorded_at ON source_health_snapshots(recorded_at DESC);

CREATE TABLE source_change_events (
    id                    UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
    source_id             UUID NOT NULL REFERENCES filter_sources(id) ON DELETE CASCADE,
    previous_version_id   UUID REFERENCES filter_list_versions(id) ON DELETE SET NULL,
    new_version_id        UUID NOT NULL REFERENCES filter_list_versions(id) ON DELETE CASCADE,
    rule_count_delta      INT NOT NULL DEFAULT 0,
    content_hash_changed  BOOLEAN NOT NULL DEFAULT TRUE,
    detected_at           TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

CREATE INDEX idx_source_change_source_id   ON source_change_events(source_id);
CREATE INDEX idx_source_change_detected_at ON source_change_events(detected_at DESC);

References

Local Development Database Setup

Run PostgreSQL locally via Docker. No installation needed.

# Start PostgreSQL 18 in Docker
docker run -d \
  --name adblock-postgres \
  -e POSTGRES_USER=<user> \
  -e POSTGRES_PASSWORD=<password> \
  -e POSTGRES_DB=adblock_dev \
  -p 5432:5432 \
  postgres:18-alpine

# Verify it's running
docker ps | grep adblock-postgres

Connection string: postgresql://<user>:<password>@127.0.0.1:5432/adblock_dev

See .env.example for the variable names to set in .env.local.

Docker Compose (alternative)

Add to a docker-compose.yml at the project root:

services:
  postgres:
    image: postgres:18-alpine
    ports:
      - "5432:5432"
    environment:
      POSTGRES_USER: <user>
      POSTGRES_PASSWORD: <password>
      POSTGRES_DB: adblock_dev
    volumes:
      - pgdata:/var/lib/postgresql/data

volumes:
  pgdata:
docker compose up -d

Option B: Native PostgreSQL (macOS)

# Install via Homebrew
brew install postgresql@18

# Start the service
brew services start postgresql@18

# Create the development database and user
createdb adblock_dev
createuser <user> --createdb
psql -c "ALTER USER <user> PASSWORD '<password>';"

Connection string: postgresql://<user>:<password>@127.0.0.1:5432/adblock_dev

Configure Environment

Set DATABASE_URL in your .env.local (not committed to git):

# Copy the example file and fill in your local credentials
cp .env.example .env.local
# Then edit .env.local and set:
# DATABASE_URL="postgresql://<user>:<password>@127.0.0.1:5432/adblock_dev"
# DIRECT_DATABASE_URL="postgresql://<user>:<password>@127.0.0.1:5432/adblock_dev"

The .envrc file loads .env.local automatically via direnv.

Apply Migrations

# Generate Prisma client + apply migrations
npx prisma migrate dev

# Or just apply existing migrations without creating new ones
npx prisma migrate deploy

# Open Prisma Studio to browse data
npx prisma studio

Seed Data (optional)

# Seed with sample filter sources
npx prisma db seed

Wrangler Local Dev

Wrangler uses the WRANGLER_HYPERDRIVE_LOCAL_CONNECTION_STRING_HYPERDRIVE env var (or the localConnectionString placeholder in wrangler.toml) for the Hyperdrive binding during wrangler dev. Set the real value in .env.local:

# .env.local (gitignored)
WRANGLER_HYPERDRIVE_LOCAL_CONNECTION_STRING_HYPERDRIVE="postgresql://<user>:<password>@127.0.0.1:5432/adblock_dev"

When you run deno task wrangler:dev (which calls wrangler dev), the Hyperdrive binding resolves to your local PostgreSQL instance.

Switching Environments

EnvironmentDATABASE_URLHow
Local devpostgresql://<user>:<password>@localhost:5432/adblock_dev.env.local
CI/stagingPlanetScale development branch connection stringGitHub Actions secret
ProductionPlanetScale main branch connection stringwrangler secret put DATABASE_URL

The Prisma schema provider is always postgresql — only the connection string changes.

Troubleshooting

"Connection refused" on port 5432:

  • Docker: docker ps to verify the container is running
  • Native: brew services list to check PostgreSQL status

"Database does not exist":

  • Run createdb adblock_dev or restart the Docker container

Prisma migration errors:

  • npx prisma migrate reset to drop and recreate the database (destructive!)
  • Check that DATABASE_URL in .env.local is correct

Modern PostgreSQL Practices

Target: PostgreSQL 18+ (PlanetScale native PostgreSQL)

Extensions

PlanetScale PostgreSQL supports commonly used extensions. The schema leverages:

ExtensionPurposeUsed For
pgcryptoUUID generationPrimary keys (gen_random_uuid())
pg_trgmTrigram similarityFuture: fuzzy search on filter rule content

Enable in a migration:

CREATE EXTENSION IF NOT EXISTS "pgcrypto";
CREATE EXTENSION IF NOT EXISTS "pg_trgm";

Schema Design Practices

UUID Primary Keys

All tables use UUID primary keys instead of auto-incrementing integers:

  • No sequential enumeration attacks
  • Safe for distributed inserts (Workers in multiple regions)
  • Mergeable across database branches without ID conflicts

JSONB for Flexible Data

compiled_outputs.config_snapshot uses JSONB:

  • Query individual fields: WHERE config_snapshot->>'name' = 'EasyList'
  • Index specific paths: CREATE INDEX ON compiled_outputs ((config_snapshot->>'name'))
  • No schema migration needed when config shape evolves

PostgreSQL Arrays

api_keys.scopes uses TEXT[] (native array):

  • Check scope: WHERE 'compile' = ANY(scopes)
  • No join table needed for simple RBAC
  • Indexable with GIN: CREATE INDEX ON api_keys USING GIN(scopes)

Partial Unique Indexes

filter_list_versions enforces "at most one current version per source" via a partial unique index (applied as a raw SQL migration, since Prisma does not support partial indexes in the schema DSL):

CREATE UNIQUE INDEX idx_filter_list_versions_current
    ON filter_list_versions(source_id)
    WHERE is_current = TRUE;

This allows unlimited historical (non-current) versions while still guaranteeing uniqueness for the active version. It is a PostgreSQL-specific feature that SQLite and MySQL don't support.

Timestamptz

All timestamp columns use TIMESTAMPTZ (timestamp with time zone) instead of TIMESTAMP:

  • Stores in UTC internally, converts to client timezone on read
  • Prevents timezone confusion between Workers in different regions
  • PostgreSQL best practice since v8.0

Performance Settings

Connection Pooling

PlanetScale provides built-in connection pooling. Hyperdrive adds a second layer of edge-side pooling. No need for PgBouncer or similar.

Recommended Hyperdrive caching:

wrangler hyperdrive update <ID> \
    --caching-disabled=false \
    --max-age=60 \
    --stale-while-revalidate=15

Indexes

The schema includes targeted indexes for the most common query patterns:

  • api_keys(key_hash) — API key lookup on every authenticated request
  • compilation_events(created_at DESC) — Dashboard analytics, most recent first
  • filter_sources(status) — Health monitoring queries
  • compiled_outputs(config_hash) — Cache deduplication by configuration

Append-Only Tables

compilation_events and source_health_snapshots are append-only (no UPDATEs). This is optimal for:

  • Write performance (no row locking contention)
  • Time-series analytics (partition by month if volume grows)
  • Audit trail (immutable history)

Future optimization: partition by created_at month if table exceeds 10M rows.

Security

Row-Level Security (Future)

PostgreSQL supports RLS for multi-tenant isolation:

ALTER TABLE compiled_outputs ENABLE ROW LEVEL SECURITY;

CREATE POLICY user_owns_output ON compiled_outputs
    USING (owner_user_id = current_setting('app.current_user_id')::uuid);

This is planned for Phase 4 (authentication) when per-user data isolation is needed.

Credential Storage

  • API keys: only the SHA-256 hash is stored (key_hash), never plaintext
  • Sessions: only the token hash is stored (token_hash)
  • The key_prefix (first 8 chars) allows users to identify keys in the UI

References

Prisma ORM Evaluation for Storage Classes

Overview

This document evaluates the storage backend options for the adblock-compiler project. Prisma ORM with SQLite is now the default storage backend.

Prisma Supported Databases

Prisma is a next-generation ORM for Node.js and TypeScript that supports the following databases:

Relational Databases (SQL)

DatabaseStatusNotes
PostgreSQLFull SupportPrimary recommendation for production
MySQLFull SupportIncluding MySQL 5.7+
MariaDBFull SupportMySQL-compatible
SQLiteFull SupportGreat for local development/embedded
SQL ServerFull SupportMicrosoft SQL Server 2017+
CockroachDBFull SupportDistributed SQL database

NoSQL Databases

DatabaseStatusNotes
MongoDBFull SupportSpecial connector with some limitations

Cloud Database Integrations

ProviderStatusNotes
SupabaseSupportedPostgreSQL-based
PlanetScaleSupportedMySQL-compatible
TursoSupportedSQLite edge database
Cloudflare D1SupportedSQLite at the edge
NeonSupportedServerless PostgreSQL

Upcoming Features (2025)

  • PostgreSQL extensions support (PGVector, Full-Text Search via ParadeDB)
  • Prisma 7 major release with modernized foundations

Current Implementation Analysis

Current Architecture: Prisma with SQLite

The project uses Prisma ORM with SQLite as the default storage backend:

PrismaStorageAdapter (SQLite/PostgreSQL/MySQL)
├── CachingDownloader
│   ├── ChangeDetector
│   └── SourceHealthMonitor
└── IncrementalCompiler (MemoryCacheStorage)

Key Characteristics:

  • Flexible database support (SQLite default, PostgreSQL, MySQL, etc.)
  • Cross-runtime compatibility (Node.js, Deno, Bun)
  • Hierarchical keys: ['cache', 'filters', source]
  • Application-level TTL support
  • Type-safe generic operations

Storage Classes Summary

ClassPurposeComplexity
PrismaStorageAdapterCore KV operationsLow
D1StorageAdapterCloudflare edge storageLow
CachingDownloaderSmart download cachingMedium
ChangeDetectorTrack filter changesLow
SourceHealthMonitorTrack source reliabilityLow
IncrementalCompilerCompilation cachingMedium

Comparison: Prisma SQLite vs Other Options

Feature Comparison

FeaturePrisma/SQLitePrisma/PostgreSQLCloudflare D1
Schema DefinitionPrisma SchemaPrisma SchemaSQL
Type SafetyGenerated typesGenerated typesManual
QueriesRich query APIRich query APIRaw SQL
RelationsFirst-classFirst-classManual
MigrationsBuilt-inBuilt-inManual
TTL SupportApplication-levelApplication-levelApplication-level
TransactionsFull ACIDFull ACIDLimited
ToolingPrisma StudioPrisma StudioWrangler CLI
RuntimeAllAllWorkers only
InfrastructureNone (embedded)Server requiredEdge

Pros and Cons

Prisma with SQLite (Default)

Pros:

  • Zero infrastructure overhead
  • Cross-runtime compatibility (Node.js, Deno, Bun)
  • Simple API for KV operations
  • Works offline/locally
  • Type-safe with generated client
  • Built-in migrations and schema management
  • Excellent tooling (Prisma Studio, CLI)
  • Fast for simple operations

Cons:

  • Single-instance only (no shared database)
  • TTL must be implemented in application code
  • Not suitable for multi-server deployments

Prisma with PostgreSQL

Pros:

  • Multi-instance support
  • Full ACID transactions
  • Rich query capabilities
  • Production-ready for scaled deployments
  • Same API as SQLite

Cons:

  • Requires database server
  • Additional infrastructure overhead
  • More complex setup

Cloudflare D1

Pros:

  • Edge-first architecture
  • Low latency globally
  • Serverless pricing model
  • No infrastructure management

Cons:

  • Cloudflare Workers only
  • Limited query capabilities
  • Different API from Prisma adapters

Use Case Analysis

Current Use Cases

Use CaseData PatternComplexitySQLite FitPostgreSQL FitD1 Fit
Filter list cachingSimple KV with TTLLowExcellentExcellentGood
Health monitoringAppend-only metricsLowGoodBetterGood
Change detectionSnapshot comparisonLowGoodGoodGood
Compilation historyTime-series queriesMediumGoodBetterGood

When to Use PostgreSQL

PostgreSQL is beneficial if:

  1. Multi-instance deployment - Shared database across servers/workers
  2. Complex queries required - Filtering, aggregation, joins
  3. Data relationships - Related entities need referential integrity
  4. Audit/compliance needs - Full transaction logs, ACID guarantees
  5. High concurrency - Multiple writers accessing the same data

When to Use SQLite (Default)

SQLite remains the best choice when:

  1. Single-instance deployment - One server or local development
  2. Simplicity is paramount - No external infrastructure needed
  3. Local/offline use - Application runs standalone
  4. Minimal maintenance - No database server to manage

When to Use Cloudflare D1

D1 is the best choice when:

  1. Edge deployment - Running on Cloudflare Workers
  2. Global distribution - Need low latency worldwide
  3. Serverless - No infrastructure management desired

Recommendation

Summary

Prisma with SQLite is the default choice for simplicity and zero infrastructure.

The existing storage patterns (caching, health monitoring, change detection) are well-suited to the Prisma adapter pattern. SQLite provides a simple embedded database that requires no external infrastructure.

Architecture

The project uses a flexible adapter pattern:

classDiagram
    class IStorageAdapter {
        +set~T~(key: string[], value: T, ttl?: number) Promise~boolean~
        +get~T~(key: string[]) Promise~StorageEntry~T~ | null~
        +delete(key: string[]) Promise~boolean~
        +list~T~(options) Promise~Array~{ key: string[]; value: StorageEntry~T~ }~~
    }
    IStorageAdapter <|-- PrismaStorageAdapter
    IStorageAdapter <|-- D1StorageAdapter

This allows switching storage backends based on deployment environment without changing application code.

Implementation Status

The project includes:

  1. IStorageAdapter - Abstract interface for storage backends
  2. PrismaStorageAdapter - Default implementation (SQLite/PostgreSQL/MySQL)
  3. D1StorageAdapter - Cloudflare edge deployment
  4. prisma/schema.prisma - Prisma schema (for SQLite/PostgreSQL/MongoDB)

Conclusion

AspectRecommendation
Default UsagePrisma with SQLite
Multi-instancePrisma with PostgreSQL
Edge DeploymentCloudflare D1
MongoDBPrisma with MongoDB connector

The storage abstraction layer enables switching backends based on deployment requirements without affecting the application code.

References

Deployment

Guides for deploying the Adblock Compiler to various platforms.

Contents

Quick Start

# Using Docker Compose (recommended)
docker compose up -d

Access the web UI at http://localhost:8787

Cloudflare Containers Deployment Guide

This guide explains how to deploy the Adblock Compiler to Cloudflare Containers.

Overview

Cloudflare Containers allows you to deploy Docker containers globally alongside your Workers. The container configuration is set up in wrangler.toml and the container image is defined in Dockerfile.container.

Current Configuration

wrangler.toml

[[containers]]
class_name = "AdblockCompiler"
image = "./Dockerfile.container"
max_instances = 5

[[durable_objects.bindings]]
class_name = "AdblockCompiler"
name = "ADBLOCK_COMPILER"

[[migrations]]
new_sqlite_classes = ["AdblockCompiler"]
tag = "v1"

worker/worker.ts

The AdblockCompiler class extends the Container class from @cloudflare/containers:

import { Container } from '@cloudflare/containers';

export class AdblockCompiler extends Container {
    defaultPort = 8787;
    sleepAfter = '10m';

    override onStart() {
        console.log('[AdblockCompiler] Container started');
    }
}

Dockerfile.container

A minimal Deno image that runs worker/container-server.ts — a lightweight HTTP server that handles compilation requests forwarded by the Worker.

Prerequisites

  1. Docker must be running — Wrangler uses Docker to build and push images

    docker info
    

    If this fails, start Docker Desktop or your Docker daemon.

  2. Wrangler authentication — Authenticate with your Cloudflare account:

    deno task wrangler login
    
  3. Container support in your Cloudflare plan — Containers are available on the Workers Paid plan.

Deployment Steps

1. Deploy to Cloudflare

deno task wrangler:deploy

This command will:

  • Build the Docker container image from Dockerfile.container
  • Push the image to Cloudflare's Container Registry (backed by R2)
  • Deploy your Worker with the container binding
  • Configure Cloudflare's network to spawn container instances on-demand

2. Wait for Provisioning

After the first deployment, wait 2–3 minutes before making requests. Unlike Workers, containers take time to be provisioned across the edge network.

3. Check Deployment Status

npx wrangler containers list

This shows all containers in your account and their deployment status.

Local Development

Windows Limitation

Containers are not supported for local development on Windows. You have two options:

  1. Use WSL (Windows Subsystem for Linux)

    wsl
    cd /mnt/d/source/adblock-compiler
    deno task wrangler:dev
    
  2. Disable containers for local dev (current configuration) The wrangler.toml has enable_containers = false in the [dev] section, which allows you to develop the Worker functionality locally without containers.

Local Development Without Containers

You can still test the Worker API locally:

deno task wrangler:dev

Visit http://localhost:8787 to access:

  • /api — API documentation
  • /compile — JSON compilation endpoint
  • /compile/stream — Streaming compilation with SSE
  • /metrics — Request metrics

Note: The ADBLOCK_COMPILER Durable Object binding is available in local dev, but containers are disabled via enable_containers = false in the [dev] section of wrangler.toml.

Container Architecture

The AdblockCompiler class in worker/worker.ts extends the Container base class from @cloudflare/containers, which handles container lifecycle, request proxying, and automatic restart:

import { Container } from '@cloudflare/containers';

export class AdblockCompiler extends Container {
    defaultPort = 8787;
    sleepAfter = '10m';
}

How It Works

  1. A request reaches the Cloudflare Worker (worker/worker.ts)
  2. The Worker passes the request to an AdblockCompiler Durable Object instance
  3. The AdblockCompiler (which extends Container) starts a container instance if one isn't already running
  4. The container (Dockerfile.container) runs worker/container-server.ts — a Deno HTTP server
  5. The server handles the compilation request using WorkerCompiler and returns the result
  6. The container sleeps after 10 minutes of inactivity (sleepAfter = '10m')

Container Server Endpoints

worker/container-server.ts exposes:

MethodPathDescription
GET/healthLiveness probe — returns { status: 'ok' }
POST/compileCompile a filter list, returns plain text

Production Deployment Workflow

  1. Build and test locally (without containers)

    deno task wrangler:dev
    
  2. Test Docker image (optional)

    docker build -f Dockerfile.container -t adblock-compiler-container:test .
    docker run -p 8787:8787 adblock-compiler-container:test
    curl http://localhost:8787/health
    
  3. Deploy to Cloudflare

    deno task wrangler:deploy
    
  4. Check deployment status

    npx wrangler containers list
    
  5. Monitor logs

    deno task wrangler:tail
    

Container Configuration Options

Scaling

[[containers]]
class_name = "AdblockCompiler"
image = "./Dockerfile.container"
max_instances = 5  # Maximum concurrent container instances

Sleep Timeout

Configured in worker/worker.ts on the AdblockCompiler class:

sleepAfter = '10m';  // Stop the container after 10 minutes of inactivity

Bindings Available

The container/worker has access to:

  • env.COMPILATION_CACHE — KV Namespace for caching compiled results
  • env.RATE_LIMIT — KV Namespace for rate limiting
  • env.METRICS — KV Namespace for metrics storage
  • env.FILTER_STORAGE — R2 Bucket for filter list storage
  • env.ASSETS — Static assets (HTML, CSS, JS)
  • env.COMPILER_VERSION — Version string
  • env.ADBLOCK_COMPILER — Durable Object binding to container

Cost Considerations

  • Containers are billed per millisecond of runtime (10ms granularity)
  • Automatically scale to zero when not in use (sleepAfter = '10m')
  • No charges when idle
  • Container registry storage is free (backed by R2)

Troubleshooting

Docker not running

Error: Docker is not running

Solution: Start Docker Desktop and run docker info to verify.

Container won't provision

Error: Container failed to start

Solution:

  1. Check npx wrangler containers list for status
  2. Check container logs with deno task wrangler:tail
  3. Verify Dockerfile.container builds locally: docker build -f Dockerfile.container -t test .

Module not found errors

If you see Cannot find module '@cloudflare/containers':

Solution: Run pnpm install to install the @cloudflare/containers package.

Next Steps

  1. Deploy to production:

    deno task wrangler:deploy
    
  2. Set up custom domain (optional)

    npx wrangler deployments domains add <your-domain>
    
  3. Monitor performance

    deno task wrangler:tail
    
  4. Update container configuration as needed in wrangler.toml and worker/worker.ts

Resources

Support

For issues or questions:

  • GitHub Issues: https://github.com/jaypatrick/adblock-compiler/issues
  • Cloudflare Discord: https://discord.gg/cloudflaredev

Cloudflare Pages Deployment Guide

This guide explains how to deploy the Adblock Compiler UI to Cloudflare Pages.

Overview

This project uses Cloudflare Workers for the main API/compiler service and Cloudflare Pages for hosting the static UI files in the public/ directory.

Important: Do NOT use deno deploy

⚠️ Common Mistake: This project is NOT deployed using deno deploy. While this is a Deno-based project, deployment to Cloudflare uses Wrangler, not Deno Deploy.

Why not Deno Deploy?

  • This project targets Cloudflare Workers runtime, not Deno Deploy
  • The worker uses Cloudflare-specific bindings (KV, R2, D1, etc.)
  • The deployment is managed through Wrangler CLI

Deployment Options

The repository includes automated CI/CD that deploys to Cloudflare Workers and Pages automatically.

See .github/workflows/ci.yml for the deployment configuration.

Requirements:

  • Set repository secrets:
    • CLOUDFLARE_API_TOKEN
    • CLOUDFLARE_ACCOUNT_ID
  • Enable deployment by setting repository variable:
    • ENABLE_CLOUDFLARE_DEPLOY=true

Option 2: Manual Deployment

Workers Deployment

# Install dependencies
npm install

# Deploy worker
deno task wrangler:deploy
# or
wrangler deploy

Angular SPA Deployment (Frontend)

The Angular frontend is deployed as part of the Cloudflare Workers bundle via the Worker's ASSETS binding, not as a standalone Cloudflare Pages project. (The "Cloudflare Pages" sections below cover only the legacy public/ static UI.) The build process requires a postbuild step because Angular's SSR builder with RenderMode.Client emits index.csr.html instead of index.html:

cd frontend

# npm run build automatically runs the postbuild lifecycle hook:
#   1. ng build  → emits dist/frontend/browser/index.csr.html
#   2. postbuild → copies index.csr.html to index.html
npm run build

# Deploy the Worker (which serves the Angular SPA via ASSETS binding)
deno task wrangler:deploy

The postbuild step is handled by frontend/scripts/postbuild.js. If you skip the postbuild, the Cloudflare Worker ASSETS binding falls back to index.csr.html, but the recommended path is always to run npm run build (not ng build directly).

SPA Routing (Worker): The Cloudflare Worker already handles SPA fallback — extensionless paths not matched by API routes are served the Angular shell (index.html) via the ASSETS binding. SPA Routing (Pages-only): If you deploy the Angular dist/ output directly to Cloudflare Pages instead of serving it via the Worker ASSETS binding, you can use a _redirects file for SPA routing. In that setup, frontend/src/_redirects should contain /* /index.html 200, and this file is copied into the browser output root during the Angular build via angular.json's assets configuration.

Pages Deployment (Legacy static UI — Retired)

⚠️ Retired: The adblock-compiler-ui Cloudflare Pages project has been retired. The Angular SPA is now served exclusively via the Worker's [assets] binding at https://adblock-compiler.jayson-knight.workers.dev. The CI steps that deployed to Pages have been removed.

The command below is kept for historical reference only and should not be used:

# RETIRED — do not use
# wrangler pages deploy public --project-name=adblock-compiler-ui

Cloudflare Pages Dashboard Configuration

If you're setting up Cloudflare Pages through the dashboard, use these settings:

Build Configuration

SettingValue
Framework presetNone
Build commandnpm install
Build output directorypublic
Root directory(leave empty)

Environment Variables

VariableValue
NODE_VERSION22

⚠️ Critical: Deploy Command

DO NOT set a deploy command to deno deploy. This will cause errors because:

  1. Deno is not installed in the Cloudflare Pages build environment by default
  2. This project uses Wrangler for deployment, not Deno Deploy
  3. The static files in public/ don't require any build step

Correct configuration:

  • Deploy command: Leave empty or use echo "No deploy command needed"
  • The public/ directory contains pre-built static files that are served directly

Common Errors

Error: /bin/sh: 1: deno: not found

Symptom:

Executing user deploy command: deno deploy
/bin/sh: 1: deno: not found
Failed: error occurred while running deploy command

Solution: Remove or change the deploy command in Cloudflare Pages dashboard settings:

  1. Go to Pages project settings
  2. Navigate to "Builds & deployments"
  3. Under "Build configuration", clear the "Deploy command" field
  4. Save changes

Error: Build fails with missing dependencies

Solution: Ensure the build command is set to npm install (not npm run build or other commands).

Architecture

flowchart TB
    PAGES["Cloudflare Pages"]
    subgraph STATIC["Static Files (public/)"]
        I["index.html (Admin Dashboard)"]
        C["compiler.html (Compiler UI)"]
        T["test.html (API Tester)"]
    end
    WORKERS["Cloudflare Workers"]
    subgraph WORKER_INNER["Worker (worker/worker.ts)"]
        API["API endpoints"]
        SVC["Compiler service"]
        BINDINGS["KV, R2, D1 bindings"]
    end

    PAGES --> I
    PAGES --> C
    PAGES --> T
    PAGES -->|calls| WORKERS
    WORKERS --> API
    WORKERS --> SVC
    WORKERS --> BINDINGS

Verification

After deployment, verify:

  1. Pages URL: https://YOUR-PROJECT.pages.dev

    • Should show the admin dashboard
    • Should load without errors
  2. Worker URL: https://adblock-compiler.YOUR-SUBDOMAIN.workers.dev

    • API endpoints should respond
    • /api should return API documentation
  3. Integration: The Pages UI should successfully call the Worker API

Troubleshooting

Pages deployment works but Worker calls fail

Cause: CORS issues or incorrect Worker URL in UI

Solution:

  1. Check that the Worker URL in the UI matches your deployed Worker
  2. Ensure CORS is configured correctly in worker/worker.ts
  3. Verify the Worker is deployed and accessible

UI shows but API calls return 404

Cause: Worker not deployed or incorrect API endpoint

Solution:

  1. Deploy the Worker: wrangler deploy
  2. Update the API endpoint URL in the UI files if needed
  3. Check Worker logs: wrangler tail

Support

For issues related to deployment, please:

  1. Check this documentation first
  2. Review the Troubleshooting Guide
  3. Open an issue on GitHub with deployment logs

Cloudflare Workers Architecture

This document describes the two Cloudflare Workers deployments that make up the Adblock Compiler service, the differences between them, and how they relate to each other.


Overview

The Adblock Compiler is deployed as two separate Cloudflare Workers from a single GitHub repository. Each has a distinct role:

adblock-compiler-backendadblock-compiler-frontend
Wrangler configwrangler.tomlfrontend/wrangler.toml
Entry pointworker/worker.tsdist/adblock-compiler/server/server.mjs
RoleREST API + compilation engineAngular 21 SSR UI
Source pathworker/ + src/frontend/
Deploy commandwrangler deploy (repo root)npm run deploy (from frontend/)
Local dev port87878787 (via npm run preview)

adblock-compiler-backend — The API Worker

What It Does

The backend worker is the compilation engine. It:

  • Exposes a REST API (POST /compile, POST /compile/stream, POST /compile/batch, GET /metrics, etc.)
  • Runs adblock/hostlist filter list compilation using the core src/ TypeScript logic (forked from AdguardTeam/HostlistCompiler)
  • Handles async queue-based compilation via Cloudflare Queues
  • Manages caching, rate limiting, and metrics via KV namespaces
  • Stores compiled outputs in R2 and persists state in D1 + Durable Objects
  • Runs scheduled background jobs (cache warming, health monitoring) via Cloudflare Workflows + Cron Triggers
  • Also serves the compiled Angular frontend as static assets via its [assets] binding (bundled deployment mode)

Source

adblock-compiler/
├── worker/
│   └── worker.ts          ← entry point
├── src/                   ← core compiler logic (forked from AdGuard HostlistCompiler)
└── wrangler.toml          ← deployment configuration (name = "adblock-compiler-backend")

Key Bindings

BindingTypePurpose
COMPILATION_CACHEKVCache compiled filter lists
RATE_LIMITKVPer-IP rate limiting
METRICSKVMetrics counters
FILTER_STORAGER2Store compiled filter list outputs
DBD1SQLite edge database
ADBLOCK_COMPILERDurable ObjectStateful compilation sessions
HYPERDRIVEHyperdriveAccelerated PostgreSQL access
ANALYTICS_ENGINEAnalytics EngineHigh-cardinality telemetry
ASSETSStatic AssetsServes compiled Angular frontend (bundled mode)

adblock-compiler-frontend — The UI Worker

What It Does

The frontend worker is the Angular 21 SSR application. It:

  • Server-side renders the Angular application at the Cloudflare edge using AngularAppEngine
  • Serves the home page as a prerendered static page (SSG); all other routes are SSR per-request
  • Serves JS/CSS/font bundles directly from Cloudflare's CDN via the ASSETS binding (the Worker never handles these requests)
  • Calls the adblock-compiler-backend worker's REST API for all compilation operations

Source

adblock-compiler/
└── frontend/
    ├── src/               ← Angular 21 application source
    ├── server.ts          ← Cloudflare Workers fetch handler (AngularAppEngine)
    └── wrangler.toml      ← deployment configuration (name = "adblock-compiler-frontend")

Key Bindings

BindingTypePurpose
ASSETSStatic AssetsJS bundles, CSS, fonts — served from CDN before the Worker is invoked

SSR Architecture

The server.ts fetch handler uses Angular 21's AngularAppEngine with the standard WinterCG fetch API — no Express, no Node.js HTTP server:

const angularApp = new AngularAppEngine();

export default {
    async fetch(request: Request, env: Env, ctx: ExecutionContext): Promise<Response> {
        const response = await angularApp.handle(request);
        return response ?? new Response('Not found', { status: 404 });
    },
} satisfies ExportedHandler<Env>;

This means:

  • Edge-compatible — runs in any WinterCG-compliant runtime (Cloudflare Workers, Deno Deploy, Fastly Compute)
  • Fast cold starts — no Express middleware chain, no Node.js HTTP server initialisation
  • Zero-overhead static assets — JS/CSS/fonts are served by Cloudflare CDN before the Worker is ever invoked

Relationship Between the Two Workers

Browser Request
      │
      ▼
┌─────────────────────────────────────────────┐
│         Cloudflare Edge Network             │
│                                             │
│  ┌──────────────────────────────────────┐   │
│  │  adblock-compiler-frontend           │   │
│  │  (Angular 21 SSR Worker)             │   │
│  │                                      │   │
│  │  • Prerendered home page (SSG)        │   │
│  │  • SSR for /compiler, /performance,  │   │
│  │    /admin, /api-docs, /validation    │   │
│  │  • Static assets served from CDN     │   │
│  │    via ASSETS binding (bypasses      │   │
│  │    Worker fetch handler entirely)    │   │
│  └───────────────┬──────────────────────┘   │
│                  │ API calls                │
│                  ▼                          │
│  ┌──────────────────────────────────────┐   │
│  │  adblock-compiler-backend            │   │
│  │  (TypeScript REST API Worker)        │   │
│  │                                      │   │
│  │  • POST /compile                     │   │
│  │  • POST /compile/stream (SSE)        │   │
│  │  • POST /compile/batch               │   │
│  │  • GET  /metrics                     │   │
│  │  • GET  /health                      │   │
│  │  • KV, R2, D1, Durable Objects,      │   │
│  │    Queues, Workflows, Hyperdrive      │   │
│  └──────────────────────────────────────┘   │
└─────────────────────────────────────────────┘

Two Deployment Modes

The backend worker supports two ways the frontend can be served:

1. Bundled Mode (single worker)

The root wrangler.toml includes an [assets] block pointing to the Angular build output:

[assets]
directory = "./frontend/dist/adblock-compiler/browser"
binding = "ASSETS"

This means a single wrangler deploy from the repo root deploys both the API and the Angular frontend as one unit. The Worker serves API requests; static assets are served by Cloudflare CDN via the binding.

2. Independent SSR Mode (two separate workers)

frontend/wrangler.toml deploys the Angular application as its own Worker with full SSR (AngularAppEngine). This is the adblock-compiler-frontend worker. It runs server-side rendering at the edge and calls the backend API for data.

Bundled ModeIndependent SSR Mode
Workers deployed1 (adblock-compiler-backend)2 (backend + frontend)
Frontend servingStatic assets via CDN bindingAngularAppEngine SSR + CDN for assets
SSR supportNo (SPA only)Yes (prerender + server rendering)
Deploy commandwrangler deploy (root)wrangler deploy (root) + npm run deploy (frontend/)
Use caseSimpler deployment, CSR onlyFull SSR, edge rendering, independent scaling

Deployment

Backend

# From repo root
wrangler deploy

Frontend (Independent SSR mode)

cd frontend
npm run build    # ng build — compiles Angular + server.mjs
npm run deploy   # wrangler deploy

Local Development

# Backend API
wrangler dev                        # → http://localhost:8787

# Frontend (Angular dev server, CSR)
cd frontend && npm start            # → http://localhost:4200

# Frontend (Cloudflare Workers preview, mirrors production SSR)
cd frontend && npm run preview      # → http://localhost:8787

Renaming Note

These workers were renamed as of 2026-03-07.

Old nameNew name
adblock-compileradblock-compiler-backend
adblock-compiler-angular-pocadblock-compiler-frontend

| If you have existing workers under the old names in your Cloudflare dashboard, they will continue to run until manually deleted. The next wrangler deploy will create new workers under the updated names.


Further Reading

Deployment Versioning System

The adblock-compiler project includes an automated deployment versioning system that tracks every successful worker deployment with detailed metadata.

Overview

Every deployment is assigned a unique version identifier that includes:

  • Semantic version (e.g., 0.11.3) from deno.json
  • Build number (auto-incrementing per version)
  • Full version (e.g., 0.11.3+build.42)
  • Git commit SHA and branch
  • Deployment timestamp and actor
  • CI/CD workflow metadata

Architecture

Components

  1. Database Schema (migrations/0002_deployment_history.sql)

    • deployment_history table: Records all deployments
    • deployment_counter table: Tracks build numbers per version
  2. Version Utilities (src/deployment/version.ts)

    • Functions to query and manage deployment history
    • TypeScript interfaces for deployment records
  3. Pre-deployment Script (scripts/generate-deployment-version.ts)

    • Generates build number before deployment
    • Creates full version string
    • Outputs version info for CI/CD
  4. Post-deployment Script (scripts/record-deployment.ts)

    • Records successful/failed deployments in D1
    • Collects git and CI/CD metadata
  5. Worker API Endpoints

    • GET /api/version - Current deployment version
    • GET /api/deployments - Deployment history
    • GET /api/deployments/stats - Deployment statistics

How It Works

Deployment Flow

1. CI/CD Trigger (push to main)
   ↓
2. Run Database Migrations
   ↓
3. Generate Deployment Version
   - Query D1 for last build number
   - Increment build number
   - Create full version string
   ↓
4. Deploy Worker
   ↓
5. Record Deployment (on success)
   - Insert deployment record into D1
   - Include git metadata, timestamps, etc.

Version Format

Full versions follow the format: {semantic-version}+build.{build-number}

Examples:

  • 0.11.3+build.1 - First deployment of version 0.11.3
  • 0.11.3+build.42 - 42nd deployment of version 0.11.3
  • 0.12.0+build.1 - First deployment of version 0.12.0

Build Number Tracking

Build numbers are tracked per semantic version:

  • When you bump from 0.11.3 to 0.11.4, build numbers reset to 1
  • Each deployment of the same version increments the build number
  • Build numbers are persisted in the deployment_counter table

Database Schema

deployment_history Table

CREATE TABLE deployment_history (
    id TEXT PRIMARY KEY,                 -- Unique deployment ID
    version TEXT NOT NULL,               -- Semantic version (0.11.3)
    build_number INTEGER NOT NULL,       -- Build number (42)
    full_version TEXT NOT NULL,          -- Full version (0.11.3+build.42)
    git_commit TEXT NOT NULL,            -- Git commit SHA
    git_branch TEXT NOT NULL,            -- Git branch (main)
    deployed_at TEXT NOT NULL,           -- ISO timestamp
    deployed_by TEXT NOT NULL,           -- Actor (github-actions[user])
    status TEXT NOT NULL,                -- success|failed|rollback
    deployment_duration INTEGER,         -- Duration in ms
    workflow_run_id TEXT,                -- GitHub workflow run ID
    workflow_run_url TEXT,               -- GitHub workflow run URL
    metadata TEXT                        -- Additional JSON metadata
);

deployment_counter Table

CREATE TABLE deployment_counter (
    version TEXT PRIMARY KEY,            -- Semantic version
    last_build_number INTEGER NOT NULL,  -- Last used build number
    updated_at TEXT NOT NULL             -- Last update timestamp
);

API Endpoints

GET /api/version

Returns the current deployed version.

Response:

{
  "success": true,
  "version": "0.11.3",
  "buildNumber": 42,
  "fullVersion": "0.11.3+build.42",
  "gitCommit": "abc123def456",
  "gitBranch": "main",
  "deployedAt": "2026-01-31 07:00:00",
  "deployedBy": "github-actions[user]",
  "status": "success"
}

GET /api/deployments

Returns deployment history with optional filters.

Query Parameters:

  • limit (default: 50) - Number of deployments to return
  • version - Filter by semantic version
  • status - Filter by status (success|failed|rollback)
  • branch - Filter by git branch

Example:

curl "https://your-worker.dev/api/deployments?limit=10&version=0.11.3"

Response:

{
  "success": true,
  "deployments": [
    {
      "version": "0.11.3",
      "buildNumber": 42,
      "fullVersion": "0.11.3+build.42",
      "gitCommit": "abc123def456",
      "gitBranch": "main",
      "deployedAt": "2026-01-31 07:00:00",
      "deployedBy": "github-actions[user]",
      "status": "success",
      "metadata": {
        "ci_platform": "github-actions",
        "workflow_run_id": "12345",
        "workflow_run_url": "https://github.com/..."
      }
    }
  ],
  "count": 1
}

GET /api/deployments/stats

Returns deployment statistics.

Response:

{
  "success": true,
  "totalDeployments": 150,
  "successfulDeployments": 145,
  "failedDeployments": 5,
  "latestVersion": "0.11.3+build.42"
}

CI/CD Integration

The deployment versioning system is integrated into the GitHub Actions workflow (.github/workflows/ci.yml).

Deploy Job Steps

  1. Setup Deno - Required for scripts
  2. Run Database Migrations - Ensure schema is up to date
  3. Generate Deployment Version - Create version info
  4. Deploy Worker - Deploy to Cloudflare
  5. Record Deployment - Save deployment record

Environment Variables

The scripts require the following environment variables:

  • CLOUDFLARE_ACCOUNT_ID - Cloudflare account ID
  • CLOUDFLARE_API_TOKEN - Cloudflare API token
  • D1_DATABASE_ID - D1 database ID (optional, can be read from wrangler.toml)
  • GITHUB_SHA - Git commit SHA (auto-provided by GitHub Actions)
  • GITHUB_REF - Git ref (auto-provided by GitHub Actions)
  • GITHUB_ACTOR - GitHub actor (auto-provided by GitHub Actions)
  • GITHUB_RUN_ID - Workflow run ID (auto-provided by GitHub Actions)

Manual Usage

Generate Deployment Version

deno run --allow-read --allow-write --allow-net --allow-env \
  scripts/generate-deployment-version.ts

This creates a .deployment-version.json file with:

{
  "version": "0.11.3",
  "buildNumber": 42,
  "fullVersion": "0.11.3+build.42"
}

Record Deployment

After a successful deployment:

deno run --allow-read --allow-net --allow-env \
  scripts/record-deployment.ts --status=success

After a failed deployment:

deno run --allow-read --allow-net --allow-env \
  scripts/record-deployment.ts --status=failed

Querying Deployment History

Using TypeScript/Deno

import { getLatestDeployment, getDeploymentHistory, getDeploymentStats } from './src/deployment/version.ts';

// Assuming you have a D1 database instance
const db = /* your D1 database */;

// Get latest deployment
const latest = await getLatestDeployment(db);
console.log(latest?.fullVersion); // "0.11.3+build.42"

// Get deployment history
const history = await getDeploymentHistory(db, {
  limit: 10,
  version: '0.11.3',
});

// Get deployment stats
const stats = await getDeploymentStats(db);
console.log(`Total deployments: ${stats.totalDeployments}`);

Using D1 CLI

# Query latest deployment
wrangler d1 execute adblock-compiler-d1-database \
  --remote \
  --command "SELECT * FROM deployment_history WHERE status='success' ORDER BY deployed_at DESC LIMIT 1"

# Query deployment count by version
wrangler d1 execute adblock-compiler-d1-database \
  --remote \
  --command "SELECT version, COUNT(*) as count FROM deployment_history GROUP BY version"

# Query failed deployments
wrangler d1 execute adblock-compiler-d1-database \
  --remote \
  --command "SELECT * FROM deployment_history WHERE status='failed'"

Rollback Support

To mark a deployment as rolled back:

import { markDeploymentRollback } from './src/deployment/version.ts';

await markDeploymentRollback(db, '0.11.3+build.42');

This updates the deployment status to 'rollback' without deleting the record.

Troubleshooting

Build number not incrementing

Symptom: Build numbers stay at 1 or don't increment

Possible causes:

  • D1 credentials not available in CI/CD
  • Database migration not applied
  • Network connectivity issues with D1 API

Solution:

  1. Verify environment variables are set
  2. Check GitHub Actions secrets
  3. Manually run migrations: wrangler d1 execute adblock-compiler-d1-database --file=migrations/0002_deployment_history.sql --remote

Deployment not recorded

Symptom: Deployment succeeds but no record in database

Possible causes:

  • Post-deployment script failed
  • D1 credentials missing
  • Database migration not applied

Solution:

  1. Check GitHub Actions logs for script errors
  2. Verify D1 database ID matches wrangler.toml
  3. Manually record deployment using the script

API endpoints return 503

Symptom: /api/version returns "D1 database not available"

Possible causes:

  • D1 binding not configured in wrangler.toml
  • Database not created
  • Database ID incorrect

Solution:

  1. Verify D1 binding in wrangler.toml
  2. Create database if needed: wrangler d1 create adblock-compiler-d1-database
  3. Update database_id in wrangler.toml

Best Practices

  1. Always use CI/CD for deployments - Manual deployments won't be tracked
  2. Don't modify build numbers manually - Let the system auto-increment
  3. Keep deployment history - Don't delete old records, mark as rollback instead
  4. Monitor deployment stats - Use /api/deployments/stats to track success rate
  5. Use semantic versioning - Bump version in deno.json when releasing features

Future Enhancements

Potential improvements to the deployment versioning system:

  • Automated rollback on failed health checks
  • Deployment notifications (Slack, email)
  • Deployment approval workflow
  • A/B testing support with version tags
  • Performance metrics per deployment
  • Automated changelog generation from git commits

See Also

Docker

Production Readiness Assessment

Project: adblock-compiler Version: 0.11.7 Assessment Date: 2026-02-11 Assessment Scope: Logging, Validation, Exception Handling, Tracing, Diagnostics

Executive Summary

The adblock-compiler codebase demonstrates strong engineering fundamentals with comprehensive error handling, structured logging, and sophisticated diagnostics infrastructure. However, several gaps exist that should be addressed for production deployment at scale.

Overall Readiness: 🟡 Good Foundation, Needs Enhancement

Critical Areas:

  • Excellent: Error hierarchy, diagnostics infrastructure, transformation testing
  • 🟡 Good: Logging implementation, configuration validation, test coverage
  • 🔴 Needs Work: Observability export, input validation library, security headers

1. Logging System

Current State

Strengths:

  • ✅ Custom Logger class (src/utils/logger.ts) with hierarchical logging
  • ✅ Log levels: Trace, Debug, Info, Warn, Error
  • ✅ Child logger support with nested prefixes
  • ✅ Color-coded output for terminal readability
  • ✅ Silent logger for testing environments
  • ✅ Good test coverage (15 tests in logger.test.ts)

Issues:

🐛 BUG-001: Direct console.log/console.error usage bypasses logger

Severity: Medium Location: Multiple files

  • src/diagnostics/DiagnosticsCollector.ts:90-92, 128-130 (intentional warnings)
  • src/utils/EventEmitter.ts (console.error for handler exceptions)
  • src/queue/CloudflareQueueProvider.ts (console.error for queue errors)
  • src/services/AnalyticsService.ts (console.warn for failures)

Impact: Inconsistent logging, difficult to filter/route logs in production

Recommendation:

// Replace:
console.error('Queue error:', error);

// With:
this.logger.error('Queue error', { error });

🚀 FEATURE-001: Add structured JSON logging

Priority: High Justification: Production log aggregation systems (CloudWatch, Datadog, etc.) require structured logs

Implementation:

interface StructuredLog {
    timestamp: string;
    level: LogLevel;
    message: string;
    context?: Record<string, unknown>;
    correlationId?: string;
    traceId?: string;
}

class StructuredLogger extends Logger {
    log(level: LogLevel, message: string, context?: Record<string, unknown>) {
        const entry: StructuredLog = {
            timestamp: new Date().toISOString(),
            level,
            message,
            context,
            correlationId: this.correlationId,
        };
        console.log(JSON.stringify(entry));
    }
}

Files to modify:

  • src/utils/logger.ts - Add StructuredLogger class
  • src/types/index.ts - Add StructuredLog interface
  • Configuration option to enable JSON output

🚀 FEATURE-002: Per-module log level configuration

Priority: Medium Justification: Enable verbose logging for specific modules during debugging without flooding logs

Implementation:

interface LoggerConfig {
    defaultLevel: LogLevel;
    moduleOverrides?: Record<string, LogLevel>; // e.g., { 'compiler': LogLevel.Debug }
}

🚀 FEATURE-003: Log file output with rotation

Priority: Low Justification: Worker environments use stdout, but CLI could benefit from file logging

Implementation: Add optional file appender with size-based rotation


2. Input Validation

Current State

Strengths:

  • ✅ Pure TypeScript validation in ConfigurationValidator.ts
  • ✅ Detailed path-based error messages
  • ✅ Source URL, type, and transformation validation
  • ✅ Rate limiting middleware (worker/middleware/index.ts)
  • ✅ Admin auth and Turnstile verification

Issues:

✅ BUG-002: Request body size limits (RESOLVED)

Status: Fixed in commit 8b67d43 (2026-02-13) Location: worker/middleware/index.ts - validateRequestSize() function

Implementation:

  • Added validateRequestSize() middleware function
  • Configurable via MAX_REQUEST_BODY_MB environment variable
  • Default limit: 1MB
  • Returns 413 Payload Too Large for oversized requests
  • Validates both Content-Length header and actual body size

🐛 BUG-003: Weak type validation in compile handler

Severity: Medium Location: worker/handlers/compile.ts:85-95

Current Code:

const { configuration }

Issue: Type assertion without runtime validation - invalid data could pass through

Recommendation: Use validation before type assertion

🚀 FEATURE-004: Add Zod schema validation

Priority: High Justification: Type-safe runtime validation with zero dependencies for Deno

Implementation:

import { z } from "https://deno.land/x/zod/mod.ts";

const SourceSchema = z.object({
    source: z.string().url(),
    name: z.string().optional(),
    type: z.enum(['adblock', 'hosts']).optional(),
});

const ConfigurationSchema = z.object({
    name: z.string().min(1),
    description: z.string().optional(),
    sources: z.array(SourceSchema).nonempty(),
    transformations: z.array(z.nativeEnum(TransformationType)).optional(),
    exclusions: z.array(z.string()).optional(),
    inclusions: z.array(z.string()).optional(),
});

// Usage:
const config = ConfigurationSchema.parse(body.configuration);

Files to modify:

  • src/configuration/ConfigurationValidator.ts - Replace with Zod
  • worker/handlers/compile.ts - Add request body schema
  • deno.json - Add Zod dependency

🚀 FEATURE-005: Add URL allowlist/blocklist

Priority: Medium Justification: Prevent SSRF attacks by restricting source URLs to known domains

Implementation:

interface UrlValidationConfig {
    allowedDomains?: string[]; // e.g., ['raw.githubusercontent.com']
    blockedDomains?: string[]; // e.g., ['localhost', '127.0.0.1']
    allowPrivateIPs?: boolean; // default: false
}

3. Exception Handling

Current State

Strengths:

  • ✅ Comprehensive error hierarchy (src/utils/ErrorUtils.ts)
  • ✅ 8 custom error types with metadata
  • ✅ 18 error codes for categorization
  • ✅ Stack trace preservation and cause chain support
  • ✅ Retry detection via isRetryable()
  • ✅ Error formatting utilities
  • ✅ 96 try/catch blocks across codebase

Error Types:

  1. BaseError - Abstract base with code, timestamp, cause
  2. CompilationError - Compilation failures
  3. ConfigurationError - Invalid configs
  4. ValidationError - Validation with path and details
  5. NetworkError - HTTP errors with status and retry flag
  6. SourceError - Source download failures
  7. TransformationError - Transformation failures
  8. StorageError - Storage operation failures
  9. FileSystemError - File operation failures

Issues:

🐛 BUG-004: Silent error swallowing in FilterService

Severity: Medium Location: src/services/FilterService.ts:44

Current Code:

try {
    const content = await this.downloader.download(source);
    return content;
} catch (error) {
    this.logger.error(`Failed to download source: ${source}`, error);
    return ""; // Silent failure
}

Issue: Returns empty string on error, caller can't distinguish success from failure

Recommendation:

// Option 1: Let error propagate
throw ErrorUtils.wrap(error, `Failed to download source: ${source}`);

// Option 2: Return Result type
return { success: false, error: ErrorUtils.getMessage(error) };

🐛 BUG-005: Database errors not wrapped with custom types

Severity: Low Location: src/storage/PrismaAdapter.ts, src/storage/D1Adapter.ts

Current Code: Direct throw of Prisma/D1 errors

Recommendation: Wrap with StorageError for consistent error handling:

try {
    await this.prisma.compilation.create({ data });
} catch (error) {
    throw new StorageError(
        "Failed to create compilation record",
        ErrorCode.STORAGE_WRITE_FAILED,
        error,
    );
}

🚀 FEATURE-006: Centralized error reporting service

Priority: High Justification: Production systems need error aggregation (Sentry, Datadog, etc.)

Implementation:

interface ErrorReporter {
    report(error: Error, context?: Record<string, unknown>): void;
}

class SentryErrorReporter implements ErrorReporter {
    constructor(private dsn: string) {}

    report(error: Error, context?: Record<string, unknown>): void {
        // Send to Sentry with context
    }
}

class ConsoleErrorReporter implements ErrorReporter {
    report(error: Error, context?: Record<string, unknown>): void {
        console.error(ErrorUtils.format(error), context);
    }
}

Files to create:

  • src/utils/ErrorReporter.ts - Interface and implementations
  • Update all catch blocks to use reporter

🚀 FEATURE-007: Add error code documentation

Priority: Medium Justification: Developers and operators need to understand error codes

Implementation: Create docs/ERROR_CODES.md with:

  • Error code → meaning mapping
  • Recommended actions for each code
  • Example scenarios

🚀 FEATURE-008: Add circuit breaker pattern

Priority: High Justification: Prevent cascading failures when sources are consistently failing

Implementation:

class CircuitBreaker {
    private failureCount = 0;
    private state: "CLOSED" | "OPEN" | "HALF_OPEN" = "CLOSED";
    private lastFailureTime?: Date;

    constructor(
        private threshold: number = 5,
        private timeout: number = 60000, // 1 minute
    ) {}

    async execute<T>(fn: () => Promise<T>): Promise<T> {
        if (this.state === 'OPEN') {
            if (
                this.lastFailureTime &&
                Date.now() - this.lastFailureTime.getTime() > this.timeout
            ) {
                this.state = 'HALF_OPEN';
            } else {
                throw new Error('Circuit breaker is OPEN');
            }
        }

        try {
            const result = await fn();
            this.onSuccess();
            return result;
        } catch (error) {
            this.onFailure();
            throw error;
        }
    }

    private onSuccess(): void {
        this.failureCount = 0;
        this.state = 'CLOSED';
    }

    private onFailure(): void {
        this.failureCount++;
        this.lastFailureTime = new Date();

        if (this.failureCount >= this.threshold) {
            this.state = 'OPEN';
        }
    }
}

Files to create:

  • src/utils/CircuitBreaker.ts
  • src/utils/CircuitBreaker.test.ts
  • Integrate into src/downloader/FilterDownloader.ts

4. Tracing and Diagnostics

Current State

Strengths:

  • ✅ Comprehensive diagnostics system (src/diagnostics/)
  • ✅ 6 event types: Diagnostic, OperationStart, OperationComplete, OperationError, PerformanceMetric, Cache, Network
  • ✅ Event categories: Compilation, Download, Transformation, Cache, Validation, Network, Performance, Error
  • ✅ Correlation ID support for grouping events
  • ✅ Decorator support (@traced, @tracedAsync)
  • ✅ Wrapper functions (traceSync, traceAsync)
  • ✅ No-op implementation for disabled tracing
  • ✅ Test coverage (DiagnosticsCollector.test.ts, TracingContext.test.ts)

Issues:

🐛 BUG-006: Diagnostics events stored only in memory

Severity: High Location: src/diagnostics/DiagnosticsCollector.ts

Issue: Events collected in private events: DiagnosticEvent[] = [] but never exported

Recommendation: Add event export mechanism:

interface DiagnosticsExporter {
    export(events: DiagnosticEvent[]): Promise<void>;
}

class ConsoleDiagnosticsExporter implements DiagnosticsExporter {
    async export(events: DiagnosticEvent[]): Promise<void> {
        events.forEach((event) => console.log(JSON.stringify(event)));
    }
}

class CloudflareAnalyticsExporter implements DiagnosticsExporter {
    constructor(private analyticsEngine: AnalyticsEngine) {}

    async export(events: DiagnosticEvent[]): Promise<void> {
        for (const event of events) {
            this.analyticsEngine.writeDataPoint({
                indexes: [event.correlationId],
                blobs: [event.category, event.message],
                doubles: [event.timestamp.getTime()],
            });
        }
    }
}

🐛 BUG-007: No distributed trace ID propagation

Severity: Medium Location: Worker handlers don't propagate trace IDs across async operations

Recommendation: Add trace context to all async operations:

// Extract from request header
const traceId = request.headers.get('X-Trace-Id') || crypto.randomUUID();

// Pass to all operations
const context = createTracingContext({
    traceId,
    correlationId: crypto.randomUUID(),
});

🚀 FEATURE-009: Add OpenTelemetry integration

Priority: High Justification: Industry-standard distributed tracing compatible with all major platforms

Implementation:

import { SpanStatusCode, trace } from "@opentelemetry/api";

const tracer = trace.getTracer('adblock-compiler', VERSION);

async function compileWithTracing(config: IConfiguration): Promise<string> {
    return tracer.startActiveSpan('compile', async (span) => {
        try {
            span.setAttribute('config.name', config.name);
            span.setAttribute('config.sources.count', config.sources.length);

            const result = await compile(config);

            span.setStatus({ code: SpanStatusCode.OK });
            return result;
        } catch (error) {
            span.recordException(error);
            span.setStatus({ code: SpanStatusCode.ERROR });
            throw error;
        } finally {
            span.end();
        }
    });
}

Files to modify:

  • Add @opentelemetry/api dependency
  • Create src/diagnostics/OpenTelemetryExporter.ts
  • Update src/compiler/SourceCompiler.ts with spans

🚀 FEATURE-010: Add performance sampling

Priority: Medium Justification: Tracing all operations at high volume impacts performance

Implementation:

class SamplingDiagnosticsCollector extends DiagnosticsCollector {
    constructor(
        private samplingRate: number = 0.1, // 10%
        ...args
    ) {
        super(...args);
    }

    recordEvent(event: DiagnosticEvent): void {
        if (Math.random() < this.samplingRate) {
            super.recordEvent(event);
        }
    }
}

🚀 FEATURE-011: Add request duration histogram

Priority: Medium Justification: Understand performance distribution (p50, p95, p99)

Implementation: Record request durations in buckets for analysis


5. Testing and Quality

Current State

Strengths:

  • ✅ 63 test files across src/ and worker/
  • ✅ Unit tests for utilities, transformations, compilers
  • ✅ Integration tests for worker handlers
  • ✅ E2E tests for API, WebSocket, SSE
  • ✅ Contract tests for OpenAPI spec
  • ✅ Coverage reporting configured

Issues:

🐛 BUG-008: No public coverage reports

Severity: Low Location: Coverage generated locally but not published

Recommendation:

  1. Add Codecov integration to CI workflow
  2. Generate coverage badge for README
  3. Track coverage trends over time

🐛 BUG-009: E2E tests require running server

Severity: Low Location: worker/api.e2e.test.ts, worker/websocket.e2e.test.ts

Issue: Tests marked as ignore: true by default, require manual server start

Recommendation: Add test server lifecycle management:

let server: Deno.HttpServer;

Deno.test({
    name: 'API E2E tests',
    async fn(t) {
        // Start server
        server = Deno.serve({ port: 8787 }, handler);

        await t.step('POST /compile', async () => {
            // Test here
        });

        // Cleanup
        await server.shutdown();
    },
});

🚀 FEATURE-012: Add mutation testing

Priority: Low Justification: Verify test effectiveness by introducing mutations

Implementation: Use Stryker or similar tool to mutate code and verify tests catch changes

🚀 FEATURE-013: Add performance benchmarks

Priority: Medium Justification: Track performance regressions over time

Current: Only 4 bench files exist (utils, transformations)

Recommendation: Add benchmarks for:

  • Compilation of various list sizes
  • Transformation pipeline performance
  • Cache hit/miss scenarios
  • Network fetch with retries

6. Security

Current State

Strengths:

  • ✅ Rate limiting middleware
  • ✅ Admin authentication with API keys
  • ✅ Turnstile CAPTCHA verification
  • ✅ IP extraction from Cloudflare headers

Issues:

🐛 BUG-010: No CSRF protection

Severity: High Location: Worker endpoints accept POST without CSRF tokens

Recommendation: Add CSRF token validation for state-changing operations:

function validateCsrfToken(request: Request): boolean {
    const token = request.headers.get('X-CSRF-Token');
    const cookie = getCookie(request, 'csrf-token');
    return token && cookie && token === cookie;
}

🐛 BUG-011: Missing security headers

Severity: Medium Location: Worker responses don't include security headers

Recommendation: Add middleware for security headers:

function addSecurityHeaders(response: Response): Response {
    const headers = new Headers(response.headers);
    headers.set('X-Content-Type-Options', 'nosniff');
    headers.set('X-Frame-Options', 'DENY');
    headers.set('X-XSS-Protection', '1; mode=block');
    headers.set('Content-Security-Policy', "default-src 'self'");
    headers.set(
        'Strict-Transport-Security',
        'max-age=31536000; includeSubDomains',
    );

    return new Response(response.body, {
        status: response.status,
        headers,
    });
}

🐛 BUG-012: No SSRF protection for source URLs

Severity: High Location: src/downloader/FilterDownloader.ts fetches arbitrary URLs

Recommendation: Validate URLs before fetching:

function isSafeUrl(url: string): boolean {
    const parsed = new URL(url);

    // Block private IPs
    if (
        parsed.hostname === 'localhost' ||
        parsed.hostname.startsWith('127.') ||
        parsed.hostname.startsWith('192.168.') ||
        parsed.hostname.startsWith('10.') ||
        /^172\.(1[6-9]|2[0-9]|3[0-1])\./.test(parsed.hostname)
    ) {
        return false;
    }

    // Only allow http/https
    if (!['http:', 'https:'].includes(parsed.protocol)) {
        return false;
    }

    return true;
}

🚀 FEATURE-014: Add rate limiting per endpoint

Priority: High Justification: Different endpoints have different resource costs

Implementation:

const RATE_LIMITS: Record<string, { window: number; max: number }> = {
    '/compile': { window: 60, max: 10 },
    '/health': { window: 60, max: 1000 },
    '/admin/analytics': { window: 60, max: 100 },
};

🚀 FEATURE-015: Add request signing for admin endpoints

Priority: Medium Justification: API key authentication alone is vulnerable to replay attacks

Implementation: HMAC-based request signing with timestamp validation


7. Observability and Monitoring

Issues:

🚀 FEATURE-016: Add health check endpoint enhancements

Priority: High Justification: Current health check only returns OK, doesn't check dependencies

Current: worker/handlers/health.ts returns simple { status: 'ok' }

Recommendation:

interface HealthCheckResult {
    status: 'healthy' | 'degraded' | 'unhealthy';
    version: string;
    uptime: number;
    checks: {
        database?: { status: string; latency?: number };
        cache?: { status: string; hitRate?: number };
        sources?: { status: string; failedCount?: number };
    };
}

🚀 FEATURE-017: Add metrics export endpoint

Priority: High Justification: Prometheus/Datadog need metrics in standard format

Implementation:

// GET /metrics
function exportMetrics(): string {
    return `
# HELP compilation_duration_seconds Time to compile filter lists
# TYPE compilation_duration_seconds histogram
compilation_duration_seconds_bucket{le="1"} 45
compilation_duration_seconds_bucket{le="5"} 123
compilation_duration_seconds_count 150

# HELP compilation_total Total compilations
# TYPE compilation_total counter
compilation_total{status="success"} 145
compilation_total{status="error"} 5
    `.trim();
}

🚀 FEATURE-018: Add dashboard for diagnostics

Priority: Low Justification: Real-time visibility into system health

Implementation: Web UI showing:

  • Active compilations
  • Error rates
  • Cache hit ratios
  • Source health status
  • Circuit breaker states

8. Configuration and Deployment

Issues:

🚀 FEATURE-019: Add configuration validation on startup

Priority: Medium Justification: Fail fast if environment variables are missing/invalid

Implementation:

function validateEnvironment(): void {
    const required = ['DATABASE_URL', 'ADMIN_API_KEY'];
    const missing = required.filter((key) => !Deno.env.get(key));

    if (missing.length > 0) {
        throw new Error(
            `Missing required environment variables: ${missing.join(', ')}`,
        );
    }
}

// Call on startup
validateEnvironment();

🚀 FEATURE-020: Add graceful shutdown

Priority: Medium Justification: Allow in-flight requests to complete before shutdown

Implementation:

let isShuttingDown = false;

Deno.addSignalListener('SIGTERM', () => {
    isShuttingDown = true;
    logger.info('Received SIGTERM, gracefully shutting down');

    setTimeout(() => {
        logger.error('Forced shutdown after timeout');
        Deno.exit(1);
    }, 30000); // 30 second timeout
});

// In request handler
if (isShuttingDown) {
    return new Response('Service shutting down', { status: 503 });
}

9. Documentation

Issues:

🚀 FEATURE-021: Add runbook for common operations

Priority: High Justification: Operators need clear procedures for incidents

Create: docs/RUNBOOK.md with:

  • How to investigate compilation failures
  • How to handle rate limit issues
  • How to restart services
  • How to check database health
  • How to review diagnostic events

🚀 FEATURE-022: Add API documentation

Priority: Medium Justification: External users need clear API reference

Current: OpenAPI spec exists at worker/openapi.ts

Recommendation: Generate HTML documentation from spec


Priority Matrix

Critical (Must Fix Before Production)

  1. 🚀 FEATURE-001: Structured JSON logging
  2. 🚀 FEATURE-004: Zod schema validation
  3. 🚀 FEATURE-006: Centralized error reporting
  4. 🚀 FEATURE-008: Circuit breaker pattern
  5. 🚀 FEATURE-009: OpenTelemetry integration
  6. 🐛 BUG-002: Request body size limits ✅ RESOLVED
  7. 🐛 BUG-006: Diagnostics event export
  8. 🐛 BUG-010: CSRF protection
  9. 🐛 BUG-012: SSRF protection
  10. 🚀 FEATURE-014: Per-endpoint rate limiting
  11. 🚀 FEATURE-016: Enhanced health checks
  12. 🚀 FEATURE-021: Operational runbook

High Priority (Should Fix Soon)

  1. 🐛 BUG-001: Eliminate direct console usage
  2. 🐛 BUG-003: Type validation in handlers
  3. 🐛 BUG-004: Silent error swallowing
  4. 🐛 BUG-007: Distributed trace ID propagation
  5. 🐛 BUG-011: Security headers
  6. 🚀 FEATURE-005: URL allowlist/blocklist
  7. 🚀 FEATURE-017: Metrics export endpoint

Medium Priority (Nice to Have)

  1. 🚀 FEATURE-002: Per-module log levels
  2. 🚀 FEATURE-007: Error code documentation
  3. 🚀 FEATURE-010: Performance sampling
  4. 🚀 FEATURE-011: Request duration histogram
  5. 🚀 FEATURE-013: Performance benchmarks
  6. 🚀 FEATURE-015: Request signing
  7. 🚀 FEATURE-019: Startup config validation
  8. 🚀 FEATURE-020: Graceful shutdown
  9. 🚀 FEATURE-022: API documentation
  10. 🐛 BUG-005: Database error wrapping

Low Priority (Future Enhancement)

  1. 🚀 FEATURE-003: Log file output
  2. 🚀 FEATURE-012: Mutation testing
  3. 🚀 FEATURE-018: Diagnostics dashboard
  4. 🐛 BUG-008: Public coverage reports
  5. 🐛 BUG-009: E2E test automation

Implementation Roadmap

Phase 1: Core Observability (2-3 weeks)

  • Structured JSON logging (FEATURE-001)
  • Centralized error reporting (FEATURE-006)
  • OpenTelemetry integration (FEATURE-009)
  • Diagnostics event export (BUG-006)
  • Enhanced health checks (FEATURE-016)
  • Metrics export (FEATURE-017)

Phase 2: Security Hardening (1-2 weeks)

  • Request size limits (BUG-002) ✅ RESOLVED
  • CSRF protection (BUG-010)
  • SSRF protection (BUG-012)
  • Security headers (BUG-011)
  • Per-endpoint rate limiting (FEATURE-014)

Phase 3: Input Validation (1 week)

  • Zod schema validation (FEATURE-004)
  • Type validation in handlers (BUG-003)
  • URL allowlist/blocklist (FEATURE-005)
  • Startup config validation (FEATURE-019)

Phase 4: Resilience (1-2 weeks)

  • Circuit breaker pattern (FEATURE-008)
  • Distributed trace ID propagation (BUG-007)
  • Graceful shutdown (FEATURE-020)
  • Silent error handling fixes (BUG-004, BUG-005)

Phase 5: Developer Experience (1 week)

  • Eliminate direct console usage (BUG-001)
  • Error code documentation (FEATURE-007)
  • Operational runbook (FEATURE-021)
  • API documentation (FEATURE-022)

Phase 6: Performance & Quality (ongoing)

  • Performance sampling (FEATURE-010)
  • Request duration metrics (FEATURE-011)
  • Performance benchmarks (FEATURE-013)
  • Mutation testing (FEATURE-012)
  • E2E test automation (BUG-009)

Testing Strategy

Each change should include:

  1. Unit Tests: Test individual components in isolation
  2. Integration Tests: Test component interactions
  3. E2E Tests: Test complete user workflows
  4. Performance Tests: Verify no performance regression
  5. Security Tests: Verify security controls work

Success Metrics

Pre-Production Checklist

  • All critical issues resolved
  • All high-priority issues resolved
  • Test coverage >80%
  • Load testing completed (1000 req/s)
  • Security audit passed
  • Disaster recovery plan documented
  • Monitoring dashboards configured
  • On-call runbook created
  • Incident response plan established

Production Health Indicators

  • Error Rate: <0.1% of requests
  • Latency: p95 <2s, p99 <5s
  • Availability: >99.9% uptime
  • Cache Hit Rate: >70%
  • Source Success Rate: >95%

Conclusion

The adblock-compiler codebase demonstrates strong engineering foundations with excellent error handling and diagnostics infrastructure. The primary gaps are around observability export, input validation, and security hardening.

Recommended Next Steps:

  1. Implement Phase 1 (Core Observability) immediately
  2. Follow with Phase 2 (Security Hardening)
  3. Continue with Phases 3-6 based on business priorities

Estimated Total Effort: 8-12 weeks for all phases

With these improvements, the system will be production-ready for high-scale deployment with excellent observability, security, and reliability.

Development Documentation

Technical documentation for developers working on or extending the Adblock Compiler.

Contents

Adblock Compiler — System Architecture

A comprehensive breakdown of the adblock-compiler system: modules, sub-modules, services, data flow, and deployment targets.


Table of Contents

  1. High-Level Overview
  2. System Context Diagram
  3. Core Compilation Pipeline
  4. Module Map
  5. Detailed Module Breakdown
  6. Cloudflare Worker (worker/)
  7. Web UI (public/)
  8. Cross-Cutting Concerns
  9. Data Flow Diagrams
  10. Deployment Architecture
  11. Technology Stack

High-Level Overview

The adblock-compiler is a compiler-as-a-service for adblock filter lists. It downloads filter list sources from remote URLs or local files, applies a configurable pipeline of transformations, and produces optimized, deduplicated output. It runs in three modes:

ModeRuntimeEntry Point
CLIDenosrc/cli.ts / src/cli/CliApp.deno.ts
LibraryDeno / Node.jssrc/index.ts (JSR: @jk-com/adblock-compiler)
Edge APICloudflare Workersworker/worker.ts

System Context Diagram

graph TD
    subgraph EW["External World"]
        FLS["Filter List Sources<br/>(URLs/Files)"]
        WB["Web Browser<br/>(Web UI)"]
        AC["API Consumers<br/>(CI/CD, scripts)"]
    end

    subgraph ACS["adblock-compiler System"]
        CLI["CLI App<br/>(Deno)"]
        WUI["Web UI<br/>(Static)"]
        CFW["Cloudflare Worker<br/>(Edge API)"]
        CORE["Core Library<br/>(FilterCompiler / WorkerCompiler)"]
        DL["Download & Fetch"]
        TP["Transform Pipeline"]
        VS["Validate & Schema"]
        ST["Storage & Cache"]
        DG["Diagnostics & Tracing"]
    end

    KV["Cloudflare KV<br/>(Cache, Rate Limit, Metrics)"]
    D1["Cloudflare D1<br/>(SQLite, Metadata)"]

    FLS --> CLI
    WB --> WUI
    AC --> CFW
    CLI --> CORE
    WUI --> CORE
    CFW --> CORE
    CORE --> DL
    CORE --> TP
    CORE --> VS
    CORE --> ST
    CORE --> DG
    ST --> KV
    ST --> D1

Core Compilation Pipeline

Every compilation—CLI, library, or API—follows this pipeline:

flowchart LR
    A["1. Config<br/>Loading"] --> B["2. Validate<br/>(Zod)"]
    B --> C["3. Download<br/>Sources"]
    C --> D["4. Per-Source<br/>Transforms"]
    D --> E["5. Merge<br/>All Sources"]
    E --> F["6. Global<br/>Transforms"]
    F --> G["7. Checksum<br/>& Header"]
    G --> H["8. Output<br/>(Rules)"]

Step-by-Step

StepComponentDescription
1ConfigurationLoader / API bodyLoad JSON configuration with source URLs and options
2ConfigurationValidator (Zod)Validate against ConfigurationSchema
3FilterDownloader / PlatformDownloaderFetch source content via HTTP, file system, or pre-fetched cache
4SourceCompiler + TransformationPipelineApply per-source transformations (e.g., remove comments, validate)
5FilterCompiler / WorkerCompilerMerge rules from all sources, apply exclusions/inclusions
6TransformationPipelineApply global transformations (e.g., deduplicate, compress)
7HeaderGenerator + checksum utilGenerate metadata header, compute checksum
8OutputWriter / HTTP response / SSE streamWrite to file, return JSON, or stream via SSE

Module Map

src/
├── index.ts                    # Library entry point (all public exports)
├── version.ts                  # Canonical VERSION constant
├── cli.ts / cli.deno.ts        # CLI entry points
│
├── compiler/                   # 🔧 Core compilation orchestration
│   ├── FilterCompiler.ts       #    Main compiler (file system access)
│   ├── SourceCompiler.ts       #    Per-source compilation
│   ├── IncrementalCompiler.ts  #    Incremental (delta) compilation
│   ├── HeaderGenerator.ts      #    Filter list header generation
│   └── index.ts
│
├── platform/                   # 🌐 Platform abstraction layer
│   ├── WorkerCompiler.ts       #    Edge/Worker compiler (no FS)
│   ├── HttpFetcher.ts          #    HTTP content fetcher
│   ├── PreFetchedContentFetcher.ts  # In-memory content provider
│   ├── CompositeFetcher.ts     #    Chain-of-responsibility fetcher
│   ├── PlatformDownloader.ts   #    Platform-agnostic downloader
│   ├── types.ts                #    IContentFetcher interface
│   └── index.ts
│
├── transformations/            # ⚙️ Rule transformation pipeline
│   ├── base/Transformation.ts  #    Abstract base classes
│   ├── TransformationRegistry.ts  # Registry + Pipeline
│   ├── CompressTransformation.ts
│   ├── DeduplicateTransformation.ts
│   ├── ValidateTransformation.ts
│   ├── RemoveCommentsTransformation.ts
│   ├── RemoveModifiersTransformation.ts
│   ├── ConvertToAsciiTransformation.ts
│   ├── InvertAllowTransformation.ts
│   ├── TrimLinesTransformation.ts
│   ├── RemoveEmptyLinesTransformation.ts
│   ├── InsertFinalNewLineTransformation.ts
│   ├── ExcludeTransformation.ts
│   ├── IncludeTransformation.ts
│   ├── ConflictDetectionTransformation.ts
│   ├── RuleOptimizerTransformation.ts
│   ├── TransformationHooks.ts
│   └── index.ts
│
├── downloader/                 # 📥 Filter list downloading
│   ├── FilterDownloader.ts     #    Deno-native downloader with retries
│   ├── ContentFetcher.ts       #    File system + HTTP abstraction
│   ├── PreprocessorEvaluator.ts  # !#if / !#include directives
│   ├── ConditionalEvaluator.ts #    Boolean expression evaluator
│   └── index.ts
│
├── configuration/              # ✅ Configuration validation
│   ├── ConfigurationValidator.ts  # Zod-based validator
│   ├── schemas.ts              #    Zod schemas for all request types
│   └── index.ts
│
├── config/                     # ⚡ Centralized constants & defaults
│   └── defaults.ts             #    NETWORK, WORKER, STORAGE defaults
│
├── storage/                    # 💾 Persistence & caching
│   ├── IStorageAdapter.ts      #    Abstract storage interface
│   ├── PrismaStorageAdapter.ts #    Prisma ORM adapter (SQLite default)
│   ├── D1StorageAdapter.ts     #    Cloudflare D1 adapter
│   ├── CachingDownloader.ts    #    Intelligent caching downloader
│   ├── ChangeDetector.ts       #    Content change detection
│   ├── SourceHealthMonitor.ts  #    Source health tracking
│   └── types.ts                #    StorageEntry, CacheEntry, etc.
│
├── services/                   # 🛠️ Business logic services
│   ├── FilterService.ts        #    Filter wildcard preparation
│   ├── ASTViewerService.ts     #    Rule AST parsing & display
│   ├── AnalyticsService.ts     #    Cloudflare Analytics Engine
│   └── index.ts
│
├── queue/                      # 📬 Async job queue
│   ├── IQueueProvider.ts       #    Abstract queue interface
│   ├── CloudflareQueueProvider.ts  # Cloudflare Queues impl
│   └── index.ts
│
├── diagnostics/                # 🔍 Observability & tracing
│   ├── DiagnosticsCollector.ts #    Event aggregation
│   ├── TracingContext.ts       #    Correlation & span management
│   ├── OpenTelemetryExporter.ts  # OTel bridge
│   ├── types.ts                #    DiagnosticEvent, TraceSeverity
│   └── index.ts
│
├── filters/                    # 🔍 Rule filtering
│   ├── RuleFilter.ts           #    Exclusion/inclusion pattern matching
│   └── index.ts
│
├── formatters/                 # 📄 Output formatting
│   ├── OutputFormatter.ts      #    Adblock, hosts, dnsmasq, etc.
│   └── index.ts
│
├── diff/                       # 📊 Diff reporting
│   ├── DiffReport.ts           #    Compilation diff generation
│   └── index.ts
│
├── plugins/                    # 🔌 Plugin system
│   ├── PluginSystem.ts         #    Plugin registry & loading
│   └── index.ts
│
├── deployment/                 # 🚀 Deployment tracking
│   └── version.ts              #    Deployment history & records
│
├── schemas/                    # 📋 JSON schemas
│   └── configuration.schema.json
│
├── types/                      # 📐 Core type definitions
│   ├── index.ts                #    IConfiguration, ISource, enums
│   ├── validation.ts           #    Validation-specific types
│   └── websocket.ts            #    WebSocket message types
│
├── utils/                      # 🧰 Shared utilities
│   ├── RuleUtils.ts            #    Rule parsing & classification
│   ├── StringUtils.ts          #    String manipulation
│   ├── TldUtils.ts             #    Top-level domain utilities
│   ├── Wildcard.ts             #    Glob/wildcard pattern matching
│   ├── CircuitBreaker.ts       #    Circuit breaker pattern
│   ├── AsyncRetry.ts           #    Retry with exponential backoff
│   ├── ErrorUtils.ts           #    Typed error hierarchy
│   ├── EventEmitter.ts         #    CompilerEventEmitter
│   ├── Benchmark.ts            #    Performance benchmarking
│   ├── BooleanExpressionParser.ts  # Boolean expression evaluation
│   ├── AGTreeParser.ts         #    AdGuard rule AST parser
│   ├── ErrorReporter.ts        #    Multi-target error reporting
│   ├── logger.ts               #    Logger, StructuredLogger
│   ├── checksum.ts             #    Filter list checksums
│   ├── headerFilter.ts         #    Header stripping utilities
│   └── PathUtils.ts            #    Safe path resolution
│
└── cli/                        # 💻 CLI application
    ├── CliApp.deno.ts          #    Main CLI app (Deno-specific)
    ├── ArgumentParser.ts       #    CLI argument parsing
    ├── ConfigurationLoader.ts  #    Config file loading
    ├── OutputWriter.ts         #    File output writing
    └── index.ts

worker/                         # ☁️ Cloudflare Worker
├── worker.ts                   #    Worker entry point
├── router.ts                   #    Modular request router
├── websocket.ts                #    WebSocket handler
├── html.ts                     #    Static HTML serving
├── schemas.ts                  #    API request validation
├── types.ts                    #    Env bindings, request/response types
├── tail.ts                     #    Tail worker (log consumer)
├── handlers/                   #    Route handlers
│   ├── compile.ts              #    Compilation endpoints
│   ├── metrics.ts              #    Metrics endpoints
│   ├── queue.ts                #    Queue management
│   └── admin.ts                #    Admin/D1 endpoints
├── middleware/                  #    Request middleware
│   └── index.ts                #    Rate limit, auth, size validation
├── workflows/                  #    Durable execution workflows
│   ├── CompilationWorkflow.ts
│   ├── BatchCompilationWorkflow.ts
│   ├── CacheWarmingWorkflow.ts
│   ├── HealthMonitoringWorkflow.ts
│   ├── WorkflowEvents.ts
│   └── types.ts
└── utils/                      #    Worker utilities
    ├── response.ts             #    JsonResponse helper
    └── errorReporter.ts        #    Worker error reporter

Detailed Module Breakdown

Compiler (src/compiler/)

The orchestration layer that drives the entire compilation process.

flowchart TD
    FC["FilterCompiler\n← Main entry point (has FS access)"]
    FC -->|uses| SC["SourceCompiler"]
    FC -->|uses| HG["HeaderGenerator"]
    FC -->|uses| TP["TransformationPipeline"]
    SC -->|uses| FD["FilterDownloader"]
ClassResponsibility
FilterCompilerOrchestrates full compilation: validation → download → transform → header → output. Has file system access via Deno.
SourceCompilerCompiles a single source: downloads content, applies per-source transformations.
IncrementalCompilerWraps FilterCompiler with content-hash-based caching; only recompiles changed sources. Uses ICacheStorage.
HeaderGeneratorGenerates metadata headers (title, description, version, timestamp, checksum placeholder).

Platform Abstraction (src/platform/)

Enables the compiler to run in environments without file system access (browsers, Cloudflare Workers, Deno Deploy).

flowchart TD
    WC["WorkerCompiler\n← No FS access"]
    WC -->|uses| CF["CompositeFetcher\n← Chain of Responsibility"]
    CF --> PFCF["PreFetchedContentFetcher"]
    CF --> HF["HttpFetcher\n(Fetch API)"]
ClassResponsibility
WorkerCompilerEdge-compatible compiler; delegates I/O to IContentFetcher chain.
IContentFetcherInterface: canHandle(source) + fetch(source).
HttpFetcherFetches via the standard Fetch API; works everywhere.
PreFetchedContentFetcherServes content from an in-memory map (for pre-fetched content from the worker).
CompositeFetcherTries fetchers in order; first match wins.
PlatformDownloaderPlatform-agnostic downloader with preprocessor directive support.

Transformations (src/transformations/)

The transformation pipeline uses the Strategy and Registry patterns.

flowchart TD
    TP["TransformationPipeline\n← Applies ordered transforms"]
    TP -->|delegates to| TR["TransformationRegistry\n← Maps type → instance"]
    TR -->|contains| ST1["SyncTransformation\n(Deduplicate)"]
    TR -->|contains| ST2["SyncTransformation\n(Compress)"]
    TR -->|contains| AT["AsyncTransformation\n(future async)"]

Base Classes:

ClassDescription
TransformationAbstract base; defines execute(rules): Promise<string[]>
SyncTransformationFor CPU-bound in-memory transforms; wraps sync method in Promise.resolve()
AsyncTransformationFor transforms needing I/O or external resources

Built-in Transformations:

TransformationTypeDescription
RemoveCommentsSyncStrips comment lines (!, #)
CompressSyncConverts hosts → adblock format, removes redundant rules
RemoveModifiersSyncStrips unsupported modifiers from rules
ValidateSyncValidates rules for DNS-level blocking, removes IPs
ValidateAllowIpSyncLike Validate but keeps IP address rules
DeduplicateSyncRemoves duplicate rules, preserves order
InvertAllowSyncConverts blocking rules to allow (exception) rules
RemoveEmptyLinesSyncStrips blank lines
TrimLinesSyncRemoves leading/trailing whitespace
InsertFinalNewLineSyncEnsures output ends with newline
ConvertToAsciiSyncConverts IDN/Unicode domains to punycode
ExcludeSyncApplies exclusion patterns
IncludeSyncApplies inclusion patterns
ConflictDetectionSyncDetects conflicting block/allow rules
RuleOptimizerSyncOptimizes and simplifies rules

Downloader (src/downloader/)

Handles fetching filter list content with preprocessor directive support.

flowchart TD
    FD["FilterDownloader\n← Static download() method"]
    FD -->|uses| CF["ContentFetcher\n(FS + HTTP)"]
    FD -->|uses| PE["PreprocessorEvaluator\n(!#if, !#include)"]
    PE -->|uses| CE["ConditionalEvaluator\n(boolean expr)"]
ClassResponsibility
FilterDownloaderDownloads from URLs or local files; supports retries, circuit breaker, exponential backoff.
ContentFetcherAbstraction over Deno.readTextFile and fetch() with DI interfaces (IFileSystem, IHttpClient).
PreprocessorEvaluatorProcesses !#if, !#else, !#endif, !#include, !#safari_cb_affinity directives.
ConditionalEvaluatorEvaluates boolean expressions with platform identifiers (e.g., windows && !android).

Configuration & Validation

src/configuration/ — Runtime validation:

ComponentDescription
ConfigurationValidatorValidates IConfiguration against Zod schemas; produces human-readable errors.
schemas.tsZod schemas for IConfiguration, ISource, CompileRequest, BatchRequest, HTTP options.

src/config/ — Centralized constants:

Constant GroupExamples
NETWORK_DEFAULTSTimeout (30s), max retries (3), circuit breaker threshold (5)
WORKER_DEFAULTSRate limit (10 req/60s), cache TTL (1h), max batch size (10)
STORAGE_DEFAULTSCache TTL (1h), max memory entries (100)
COMPILATION_DEFAULTSDefault source type (adblock), max concurrent downloads (10)
VALIDATION_DEFAULTSMax rule length (10K chars)
PREPROCESSOR_DEFAULTSMax include depth (10)

Storage (src/storage/)

Pluggable persistence layer with multiple backends.

flowchart TD
    ISA["IStorageAdapter\n← Abstract interface"]
    ISA --> PSA["PrismaStorageAdapter\n(SQLite, PostgreSQL, MySQL, etc.)"]
    ISA --> D1A["D1StorageAdapter\n(Edge)"]
    ISA --> MEM["(Memory) — Future"]
    CD["CachingDownloader"] -->|uses| ISA
    SHM["SourceHealthMonitor"] -->|uses| ISA
    CD -->|uses| CHD["ChangeDetector"]
ComponentDescription
IStorageAdapterInterface with hierarchical key-value ops, TTL support, filter list caching, compilation history.
PrismaStorageAdapterPrisma ORM backend: SQLite (default), PostgreSQL, MySQL, MongoDB, etc.
D1StorageAdapterCloudflare D1 (edge SQLite) backend.
CachingDownloaderWraps any IDownloader with caching, change detection, and health monitoring.
ChangeDetectorTracks content hashes to detect changes between compilations.
SourceHealthMonitorTracks fetch success/failure rates, latency, and health status per source.

Services (src/services/)

Higher-level business services.

ServiceResponsibility
FilterServiceDownloads exclusion/inclusion sources in parallel; prepares Wildcard patterns.
ASTViewerServiceParses adblock rules into structured AST using @adguard/agtree; provides category, type, syntax, properties.
AnalyticsServiceType-safe wrapper for Cloudflare Analytics Engine; tracks compilations, cache hits, rate limits, workflow events.

Queue (src/queue/)

Asynchronous job processing abstraction.

flowchart TD
    IQP["IQueueProvider\n← Abstract interface"]
    IQP --> CQP["CloudflareQueueProvider\n← Cloudflare Workers Queue binding"]
    CQP --> CM["CompileMessage\n(single compilation)"]
    CQP --> BCM["BatchCompileMessage\n(batch compilation)"]
    CQP --> CWM["CacheWarmMessage\n(cache warming)"]
    CQP --> HCM["HealthCheckMessage\n(source health checks)"]

Diagnostics & Tracing (src/diagnostics/)

End-to-end observability through the compilation pipeline.

flowchart LR
    TC["TracingContext\n(correlation ID, parent spans)"]
    DC["DiagnosticsCollector\n(event aggregation)"]
    OTE["OpenTelemetryExporter\n(Datadog, Honeycomb, Jaeger, etc.)"]
    TC --> DC
    DC -->|can export to| OTE
ComponentDescription
TracingContextCarries correlation ID, parent span, metadata through the pipeline.
DiagnosticsCollectorRecords operation start/end, network events, cache events, performance metrics.
OpenTelemetryExporterBridges to OpenTelemetry's Tracer API for distributed tracing integration.

Filters (src/filters/)

ComponentDescription
RuleFilterApplies exclusion/inclusion wildcard patterns to rule sets. Partitions into plain strings (fast) vs. regex/wildcards (slower) for optimized matching.

Formatters (src/formatters/)

ComponentDescription
OutputFormatterConverts adblock rules to multiple output formats: adblock, hosts (0.0.0.0), dnsmasq, plain domain list. Extensible via BaseFormatter.

Diff (src/diff/)

ComponentDescription
DiffReportGenerates rule-level and domain-level diff reports between two compilations. Outputs summary stats (added, removed, unchanged, % change).

Plugins (src/plugins/)

Extensibility system for custom transformations and downloaders.

flowchart TD
    PR["PluginRegistry\n← Global singleton"]
    PR -->|registers| P["Plugin\n{manifest, transforms, downloaders}"]
    P --> TPLG["TransformationPlugin"]
    P --> DPLG["DownloaderPlugin"]
ComponentDescription
PluginRegistryManages plugin lifecycle: load, init, register transformations, cleanup.
PluginDefines a manifest (name, version, author) + optional transformations and downloaders.
PluginTransformationWrapperWraps a TransformationPlugin function as a standard Transformation class.

Utilities (src/utils/)

Shared, reusable components used across all modules.

UtilityDescription
RuleUtilsRule classification: isComment(), isAdblockRule(), isHostsRule(), parseAdblockRule(), parseHostsRule().
StringUtilsString manipulation: trimming, splitting, normalization.
TldUtilsTLD validation and extraction.
WildcardGlob-style pattern matching (*, ?) compiled to regex.
CircuitBreakerThree-state circuit breaker (Closed → Open → Half-Open) for fault tolerance.
AsyncRetryRetry with exponential backoff and jitter.
ErrorUtilsTyped error hierarchy: BaseError, CompilationError, NetworkError, SourceError, ValidationError, ConfigurationError, FileSystemError.
CompilerEventEmitterType-safe event emission for compilation lifecycle.
BenchmarkCollectorPerformance timing and phase tracking.
BooleanExpressionParserParses !#if condition expressions.
AGTreeParserWraps @adguard/agtree for rule AST parsing.
ErrorReporterMulti-target error reporting (console, Cloudflare, Sentry, composite).
Logger / StructuredLoggerLeveled logging with module-specific overrides and JSON output.
checksumFilter list checksum computation.
PathUtilsSafe path resolution to prevent directory traversal.

CLI (src/cli/)

Command-line interface for local compilation.

ComponentDescription
CliAppMain CLI application; parses args, builds/overlays config, runs FilterCompiler, writes output (file, stdout, append).
ArgumentParserParses all CLI flags — transformation control, filtering, output modes, networking, and queue options. Validates via CliArgumentsSchema.
ConfigurationLoaderLoads and parses JSON configuration files.
OutputWriterWrites compiled rules to the file system.

See the CLI Reference for the full flag list and examples.

Deployment (src/deployment/)

ComponentDescription
version.tsTracks deployment history with records (version, build number, git commit, status) stored in D1.

Cloudflare Worker (worker/)

The edge deployment target that exposes the compiler as an HTTP/WebSocket API.

flowchart TD
    REQ["Incoming Request"]
    REQ --> W["worker.ts\n← Entry point (fetch, queue, scheduled)"]
    W --> R["router.ts\n(HTTP API)"]
    W --> WS["websocket.ts (WS)"]
    W --> QH["queue handler\n(async jobs)"]
    R --> HC["handlers/compile.ts"]
    R --> HM["handlers/metrics.ts"]
    R --> HQ["handlers/queue"]
    R --> HA["handlers/admin"]

API Endpoints

MethodPathHandlerDescription
POST/api/compilehandleCompileJsonSynchronous JSON compilation
POST/api/compile/streamhandleCompileStreamSSE streaming compilation
POST/api/compile/asynchandleCompileAsyncQueue-based async compilation
POST/api/compile/batchhandleCompileBatchBatch sync compilation
POST/api/compile/batch/asynchandleCompileBatchAsyncBatch async compilation
POST/api/ast/parsehandleASTParseRequestRule AST parsing
GET/api/versioninlineVersion info
GET/api/healthinlineHealth check
GET/api/metricshandleMetricsAggregated metrics
GET/api/queue/statshandleQueueStatsQueue statistics
GET/api/queue/results/:idhandleQueueResultsAsync job results
GET/wshandleWebSocketUpgradeWebSocket compilation

Admin Endpoints (require X-Admin-Key)

MethodPathHandler
GET/api/admin/storage/statsD1 storage statistics
POST/api/admin/storage/queryRaw SQL query
POST/api/admin/storage/clear-cacheClear cached data
POST/api/admin/storage/clear-expiredClean expired entries
GET/api/admin/storage/exportExport all data
POST/api/admin/storage/vacuumOptimize database
GET/api/admin/storage/tablesList D1 tables

Middleware Stack

flowchart LR
    REQ["Request"] --> RL["Rate Limit"]
    RL --> TS["Turnstile"]
    TS --> BS["Body Size"]
    BS --> AUTH["Auth"]
    AUTH --> H["Handler"]
    H --> RESP["Response"]
MiddlewareDescription
checkRateLimitKV-backed sliding window rate limiter (10 req/60s default)
verifyTurnstileTokenCloudflare Turnstile CAPTCHA verification
validateRequestSizePrevents DoS via oversized payloads (1MB default)
verifyAdminAuthAPI key authentication for admin endpoints

Durable Workflows

Long-running, crash-resistant compilation pipelines using Cloudflare Workflows:

WorkflowDescription
CompilationWorkflowFull compilation with step-by-step checkpointing: validate → fetch → transform → header → cache.
BatchCompilationWorkflowProcesses multiple compilations with progress tracking.
CacheWarmingWorkflowPre-compiles popular configurations to warm the cache.
HealthMonitoringWorkflowPeriodically checks source availability and health.

Environment Bindings

BindingTypePurpose
COMPILATION_CACHEKVCompiled rule caching
RATE_LIMITKVPer-IP rate limit tracking
METRICSKVEndpoint metrics aggregation
ADBLOCK_COMPILER_QUEUEQueueStandard priority async jobs
ADBLOCK_COMPILER_QUEUE_HIGH_PRIORITYQueueHigh priority async jobs
DBD1SQLite storage (admin, metadata)
ANALYTICS_ENGINEAnalytics EngineMetrics & analytics
ASSETSFetcherStatic web UI assets

Web UI (public/)

Static HTML/JS/CSS frontend served from Cloudflare Workers or Pages.

FileDescription
index.htmlMain landing page with documentation
compiler.htmlInteractive compilation UI with SSE streaming
admin-storage.htmlD1 storage administration dashboard
test.htmlAPI testing interface
validation-demo.htmlConfiguration validation demo
websocket-test.htmlWebSocket compilation testing
e2e-tests.htmlEnd-to-end test runner
js/theme.tsDark/light theme toggle (ESM module)
js/chart.tsChart.js configuration for metrics visualization

Cross-Cutting Concerns

Error Handling

flowchart TD
    BE["BaseError (abstract)"]
    BE --> CE["CompilationError\n— Compilation pipeline failures"]
    BE --> NE["NetworkError\n— HTTP/connection failures"]
    BE --> SE["SourceError\n— Source download/parse failures"]
    BE --> VE["ValidationError\n— Configuration/rule validation failures"]
    BE --> CFE["ConfigurationError\n— Invalid configuration"]
    BE --> FSE["FileSystemError\n— File system operation failures"]

Each error carries: code (ErrorCode enum), cause (original error), timestamp (ISO string).

Event System

The ICompilerEvents interface provides lifecycle hooks:

flowchart TD
    CS["Compilation Start"]
    CS --> OSS["onSourceStart\n(per source)"]
    CS --> OSC["onSourceComplete\n(per source, with rule count & duration)"]
    CS --> OSE["onSourceError\n(per source, with error)"]
    CS --> OTS["onTransformationStart\n(per transformation)"]
    CS --> OTC["onTransformationComplete\n(per transformation, with counts)"]
    CS --> OP["onProgress\n(phase, current/total, message)"]
    CS --> OCC["onCompilationComplete\n(total rules, duration, counts)"]

Logging

Two logger implementations:

LoggerUse Case
LoggerConsole-based, leveled (trace → error), with optional prefix
StructuredLoggerJSON output for log aggregation (CloudWatch, Datadog, Splunk)

Both implement ILogger (extends IDetailedLogger): info(), warn(), error(), debug(), trace().

Resilience Patterns

PatternImplementationUsed By
Circuit BreakerCircuitBreaker.ts (Closed → Open → Half-Open)FilterDownloader
Retry with BackoffAsyncRetry.ts (exponential + jitter)FilterDownloader
Rate LimitingKV-backed sliding windowWorker middleware
Request DeduplicationIn-memory Map<key, Promise>Worker compile handler

Data Flow Diagrams

CLI Compilation Flow

flowchart LR
    CFG["config.json"] --> CL["ConfigurationLoader"]
    FS["Filter Sources\n(HTTP/FS)"] --> FC
    CL --> FC["FilterCompiler"]
    FC --> SC["SourceCompiler\n(per src)"]
    FC --> TP["TransformationPipeline"]
    FC --> OUT["output.txt"]

Worker API Flow (SSE Streaming)

sequenceDiagram
    participant Client
    participant Worker
    participant Sources

    Client->>Worker: POST /api/compile/stream
    Worker->>Sources: Pre-fetch content
    Sources-->>Worker: content
    Note over Worker: WorkerCompiler.compile()
    Worker-->>Client: SSE: event: log
    Worker-->>Client: SSE: event: source-start
    Worker-->>Client: SSE: event: source-complete
    Worker-->>Client: SSE: event: progress
    Note over Worker: Cache result in KV
    Worker-->>Client: SSE: event: complete

Async Queue Flow

sequenceDiagram
    participant Client
    participant Worker
    participant Queue
    participant Consumer

    Client->>Worker: POST /compile/async
    Worker->>Queue: enqueue message
    Worker-->>Client: 202 {requestId}
    Queue->>Consumer: dequeue
    Consumer->>Consumer: compile
    Consumer->>Queue: store result
    Client->>Worker: GET /queue/results/:id
    Worker->>Queue: fetch result
    Worker-->>Client: 200 {rules}

Deployment Architecture

graph TD
    subgraph CFN["Cloudflare Edge Network"]
        subgraph CW["Cloudflare Worker (worker.ts)"]
            HAPI["HTTP API Router"]
            WSH["WebSocket Handler"]
            QC["Queue Consumer\n(async compile)"]
            DWF["Durable Workflows"]
            TW["Tail Worker"]
            SA["Static Assets\n(Pages/ASSETS)"]
        end
        KV["KV Store\n- Cache\n- Rates\n- Metrics"]
        D1["D1 (SQL)\n- Storage\n- Deploy\n- History"]
        QQ["Queues\n- Std\n- High"]
        AE["Analytics Engine"]
    end

    CLIENTS["Clients\n(Browser, CI/CD, CLI)"] -->|HTTP/SSE/WS| HAPI
    HAPI -->|HTTP fetch sources| FLS["Filter List Sources\n(EasyList, etc.)"]

Technology Stack

LayerTechnology
RuntimeDeno 2.6.7+
LanguageTypeScript (strict mode)
Package RegistryJSR (@jk-com/adblock-compiler)
Edge RuntimeCloudflare Workers
ValidationZod
Rule Parsing@adguard/agtree
ORMPrisma (optional, for local storage)
DatabaseSQLite (local), Cloudflare D1 (edge)
CachingCloudflare KV
QueueCloudflare Queues
AnalyticsCloudflare Analytics Engine
ObservabilityOpenTelemetry (optional), DiagnosticsCollector
UIStatic HTML + Tailwind CSS + Chart.js
CI/CDGitHub Actions
ContainerizationDocker + Docker Compose
FormattingDeno built-in formatter
TestingDeno built-in test framework + @std/assert

Adblock Compiler Benchmarks

This document describes the benchmark suite for the adblock-compiler project.

Overview

The benchmark suite covers the following areas:

  1. Utility Functions - Core utilities for rule parsing and manipulation

    • RuleUtils - Rule parsing, validation, and conversion
    • StringUtils - String manipulation operations
    • Wildcard - Pattern matching (plain, wildcard, regex)
  2. Transformations - Filter list transformation operations

    • DeduplicateTransformation - Remove duplicate rules
    • CompressTransformation - Convert and compress rules
    • RemoveCommentsTransformation - Strip comments
    • ValidateTransformation - Validate rule syntax
    • RemoveModifiersTransformation - Remove unsupported modifiers
    • TrimLinesTransformation - Trim whitespace
    • RemoveEmptyLinesTransformation - Remove empty lines
    • Chained transformations (real-world pipelines)

Running Benchmarks

Run All Benchmarks

deno bench --allow-read --allow-write --allow-net --allow-env

Run Specific Benchmark Files

# Utility benchmarks
deno bench src/utils/RuleUtils.bench.ts
deno bench src/utils/StringUtils.bench.ts
deno bench src/utils/Wildcard.bench.ts

# Transformation benchmarks
deno bench src/transformations/transformations.bench.ts

Run Benchmarks by Group

Deno allows filtering benchmarks by group name:

# Run only RuleUtils isComment benchmarks
deno bench --filter "isComment"

# Run only Deduplicate transformation benchmarks
deno bench --filter "deduplicate"

# Run only chained transformation benchmarks
deno bench --filter "chained"

Generate JSON Output

For CI/CD integration or further analysis:

deno bench --json > benchmark-results.json

Benchmark Structure

Each benchmark file follows this structure:

  • Setup - Sample data and configurations
  • Individual Operations - Test single operations with various inputs
  • Batch Operations - Test operations on multiple items
  • Real-world Scenarios - Test common usage patterns

Benchmark Groups

Benchmarks are organized into groups for easy filtering:

RuleUtils Groups

  • isComment - Comment detection
  • isAllowRule - Allow rule detection
  • isJustDomain - Domain validation
  • isEtcHostsRule - Hosts file detection
  • nonAscii - Non-ASCII character handling
  • punycode - Punycode conversion
  • parseTokens - Token parsing
  • extractHostname - Hostname extraction
  • loadEtcHosts - Hosts file parsing
  • loadAdblock - Adblock rule parsing
  • batch - Batch processing

StringUtils Groups

  • substringBetween - Substring extraction
  • split - Delimiter splitting with escapes
  • escapeRegExp - Regex escaping
  • isEmpty - Empty string checks
  • trim - Whitespace trimming
  • batch - Batch operations
  • realworld - Real-world usage

Wildcard Groups

  • creation - Pattern creation
  • plainMatch - Plain string matching
  • wildcardMatch - Wildcard pattern matching
  • regexMatch - Regex pattern matching
  • longStrings - Long string performance
  • properties - Property access
  • realworld - Filter list patterns
  • comparison - Pattern type comparison

Transformation Groups

  • deduplicate - Deduplication
  • compress - Compression
  • removeComments - Comment removal
  • validate - Validation
  • removeModifiers - Modifier removal
  • trimLines - Line trimming
  • removeEmptyLines - Empty line removal
  • chained - Chained transformations

Performance Tips

When analyzing benchmark results:

  1. Look for Regressions - Compare results across commits to catch performance regressions
  2. Focus on Hot Paths - Prioritize optimizing frequently-called operations
  3. Consider Trade-offs - Balance performance with code readability and maintainability
  4. Test with Real Data - Supplement benchmarks with real-world filter list data

CI/CD Integration

Add benchmarks to your CI pipeline:

# Example GitHub Actions
- name: Run Benchmarks
  run: deno bench --allow-read --allow-write --allow-net --allow-env --json > benchmarks.json

- name: Upload Results
  uses: actions/upload-artifact@v3
  with:
      name: benchmark-results
      path: benchmarks.json

Interpreting Results

Deno's benchmark output shows:

  • Time/iteration - Average time per benchmark iteration
  • Iterations - Number of iterations run
  • Standard deviation - Consistency of results

Lower times and smaller standard deviations indicate better performance.

Adding New Benchmarks

When adding new features, include benchmarks:

  1. Create or update the relevant .bench.ts file
  2. Follow existing naming conventions
  3. Use descriptive benchmark names
  4. Add to an appropriate group
  5. Include various input sizes (small, medium, large)
  6. Test edge cases

Example:

Deno.bench('MyComponent - operation description', { group: 'myGroup' }, () => {
    // Setup
    const component = new MyComponent();
    const input = generateTestData();

    // Benchmark
    component.process(input);
});

Baseline Expectations

Approximate performance baselines (your mileage may vary):

  • RuleUtils.isComment: ~100-500ns per call
  • RuleUtils.parseRuleTokens: ~1-5µs per call
  • Wildcard plain string match: ~50-200ns per call
  • Deduplicate 1000 rules: ~1-10ms
  • Compress 500 rules: ~5-20ms
  • Full pipeline 1000 rules: ~10-50ms

These are rough guidelines - actual performance depends on hardware, input data, and Deno version.

Circuit Breaker

The adblock-compiler includes a circuit breaker pattern for fault-tolerant filter list downloads. When a source URL fails repeatedly, the circuit breaker temporarily blocks requests to that URL, preventing cascading failures and wasted retries.

Overview

Each remote source URL gets its own circuit breaker that transitions through three states:

  1. CLOSED — Normal operation. Requests pass through. Consecutive failures are counted.
  2. OPEN — Failure threshold reached. All requests are immediately rejected. When using the CircuitBreaker directly this surfaces as a CircuitBreakerOpenError; when using FilterDownloader, the open breaker is exposed as a NetworkError. After a timeout period the breaker moves to HALF_OPEN.
  3. HALF_OPEN — Recovery probe. The next request is allowed through. If it succeeds the breaker returns to CLOSED; if it fails the breaker reopens.
stateDiagram-v2
    [*] --> CLOSED
    CLOSED --> CLOSED : success
    CLOSED --> OPEN : threshold reached (failure)
    OPEN --> HALF_OPEN : timeout elapsed
    HALF_OPEN --> CLOSED : success
    HALF_OPEN --> OPEN : failure

Default Configuration

Circuit breaker settings are defined in src/config/defaults.ts under NETWORK_DEFAULTS:

SettingDefaultDescription
CIRCUIT_BREAKER_THRESHOLD5Consecutive failures before opening the circuit
CIRCUIT_BREAKER_TIMEOUT_MS60000 (60 s)Time to wait before attempting recovery

Usage with FilterDownloader

The circuit breaker is enabled by default in FilterDownloader. Each URL automatically gets its own breaker instance.

import { FilterDownloader } from '@jk-com/adblock-compiler';

// Defaults: threshold=5, timeout=60s, enabled=true
const downloader = new FilterDownloader();

// Override circuit breaker settings
const customDownloader = new FilterDownloader({
    enableCircuitBreaker: true,
    circuitBreakerThreshold: 3,    // open after 3 failures
    circuitBreakerTimeout: 120000, // wait 2 minutes before recovery
});

const rules = await customDownloader.download('https://example.com/filters.txt');

Disabling the Circuit Breaker

const downloader = new FilterDownloader({
    enableCircuitBreaker: false,
});

Standalone Usage

You can also use CircuitBreaker directly to protect any async operation:

import { CircuitBreaker, CircuitBreakerOpenError } from '@jk-com/adblock-compiler';

const breaker = new CircuitBreaker({
    threshold: 5,
    timeout: 60000,
    name: 'my-service',
});

try {
    const result = await breaker.execute(() => fetch('https://api.example.com/data'));
    console.log('Success:', result.status);
} catch (error) {
    if (error instanceof CircuitBreakerOpenError) {
        console.log('Circuit is open — skipping request');
    } else {
        console.error('Request failed:', error.message);
    }
}

Inspecting State

// Current state: CLOSED, OPEN, or HALF_OPEN
console.log(breaker.getState());

// Full statistics
const stats = breaker.getStats();
// {
//   state: 'CLOSED',
//   failureCount: 2,
//   threshold: 5,
//   timeout: 60000,
//   lastFailureTime: undefined,
//   timeUntilRecovery: 0,
// }

Manual Reset

breaker.reset(); // Force back to CLOSED, clear failure count

Troubleshooting

"Circuit breaker is OPEN. Retry in Xs"

This means a source URL has exceeded the failure threshold. Options:

  1. Wait for the timeout to elapse — the breaker will automatically move to HALF_OPEN and attempt recovery.
  2. Check the source URL — verify it is reachable and returning valid content.
  3. Increase the threshold if the source is known to be intermittent:
const downloader = new FilterDownloader({
    circuitBreakerThreshold: 10, // tolerate more failures
});

Source permanently failing

If a source is permanently unavailable, the circuit breaker will continue cycling between OPEN and HALF_OPEN. Consider removing or disabling the source in your sources configuration. If you only need to exclude specific rules from an otherwise healthy source, use exclusions_sources to point to files containing rule exclusion patterns.

Adblock Compiler - Code Review

Date: 2026-01-13 Version Reviewed: 0.7.18 Reviewer: Comprehensive Code Review


Executive Summary

The adblock-compiler is a well-architected Deno-native project with solid fundamentals. The codebase demonstrates excellent separation of concerns, comprehensive type definitions, and multi-platform support. This review has verified code quality, addressed critical issues, and confirmed the codebase is well-organized with consistent patterns throughout.

Overall Assessment: EXCELLENT

The codebase is production-ready with:

  • Clean architecture and well-defined module boundaries
  • Comprehensive test coverage (41 test files co-located with 88 source files)
  • Centralized configuration and constants
  • Consistent error handling patterns
  • Well-documented API with extensive markdown documentation

Recent Improvements (2026-01-13)

✅ Version Synchronization - FIXED

Location: src/version.ts, src/plugins/PluginSystem.ts

Issue: Hardcoded version 0.6.91 in PluginSystem.ts was out of sync with actual version 0.7.18.

Resolution: Updated to use centralized VERSION constant from src/version.ts.

// Before: Hardcoded
compilerVersion: '0.6.91';

// After: Using constant
import { VERSION } from '../version.ts';
compilerVersion: VERSION;

✅ Magic Numbers Centralization - FIXED

Location: src/downloader/ContentFetcher.ts, worker/worker.ts

Issue: Hardcoded timeout values and rate limit constants.

Resolution: Now using centralized constants from src/config/defaults.ts.

// ContentFetcher.ts - Before
timeout: 30000; // Hardcoded

// ContentFetcher.ts - After
import { NETWORK_DEFAULTS } from '../config/defaults.ts';
timeout: NETWORK_DEFAULTS.TIMEOUT_MS;

// worker.ts - Before
const RATE_LIMIT_WINDOW = 60;
const RATE_LIMIT_MAX_REQUESTS = 10;
const CACHE_TTL = 3600;

// worker.ts - After
import { WORKER_DEFAULTS } from '../src/config/defaults.ts';
const RATE_LIMIT_WINDOW = WORKER_DEFAULTS.RATE_LIMIT_WINDOW_SECONDS;
const RATE_LIMIT_MAX_REQUESTS = WORKER_DEFAULTS.RATE_LIMIT_MAX_REQUESTS;
const CACHE_TTL = WORKER_DEFAULTS.CACHE_TTL_SECONDS;

✅ Documentation Fixes - COMPLETED

Files Updated:

  • README.md - Fixed "are are" typo, added missing ConvertToAscii transformation
  • .github/copilot-instructions.md - Updated line width (100 → 180) to match deno.json
  • CODE_REVIEW.md - Updated date and version to reflect current state

Part A: Code Quality Assessment

1. Architecture and Organization ✅ EXCELLENT

Structure:

src/
├── cli/              # Command-line interface
├── compiler/         # Core compilation logic (FilterCompiler, SourceCompiler)
├── config/           # ✅ Centralized configuration defaults
├── configuration/    # Configuration validation
├── diagnostics/      # Event emission and tracing
├── diff/             # Diff report generation
├── downloader/       # Filter list downloading and fetching
├── formatters/       # Output format converters
├── platform/         # Platform abstraction (WorkerCompiler)
├── plugins/          # Plugin system
├── services/         # High-level services
├── storage/          # Storage abstractions
├── transformations/  # Rule transformation implementations
├── types/            # TypeScript type definitions
├── utils/            # Utility functions and helpers
└── version.ts        # ✅ Centralized version management

Metrics:

  • 88 source files (excluding tests)
  • 41 test files (co-located with source)
  • 47% test coverage ratio
  • Clear module boundaries with barrel exports

2. Code Duplication ✅ MINIMAL

HeaderGenerator Abstraction:

Both FilterCompiler and WorkerCompiler properly use the HeaderGenerator utility class. No significant duplication exists.

// Both compilers use thin wrapper methods
private prepareHeader(configuration: IConfiguration): string[] {
    return this.headerGenerator.generateListHeader(configuration);
}

private prepareSourceHeader(source: ISource): string[] {
    return this.headerGenerator.generateSourceHeader(source);
}

Assessment: This is an acceptable pattern - thin wrappers maintain encapsulation while delegating to shared utilities.


3. Constants and Configuration ✅ EXCELLENT

Centralized in src/config/defaults.ts:

export const NETWORK_DEFAULTS = {
    MAX_REDIRECTS: 5,
    TIMEOUT_MS: 30_000,
    MAX_RETRIES: 3,
    RETRY_DELAY_MS: 1_000,
    RETRY_JITTER_PERCENT: 0.3,
} as const;

export const WORKER_DEFAULTS = {
    RATE_LIMIT_WINDOW_SECONDS: 60,
    RATE_LIMIT_MAX_REQUESTS: 10,
    CACHE_TTL_SECONDS: 3600,
    METRICS_WINDOW_SECONDS: 300,
    MAX_BATCH_REQUESTS: 10,
} as const;

export const COMPILATION_DEFAULTS = { ... }
export const STORAGE_DEFAULTS = { ... }
export const VALIDATION_DEFAULTS = { ... }
export const PREPROCESSOR_DEFAULTS = { ... }

Usage:

  • All magic numbers have been eliminated
  • Constants are well-documented with JSDoc comments
  • Values are typed as const for immutability
  • Organized by functional area

4. Error Handling ✅ CONSISTENT

Centralized Pattern via ErrorUtils:

// src/utils/ErrorUtils.ts
export class ErrorUtils {
    static getMessage(error: unknown): string {
        return error instanceof Error ? error.message : String(error);
    }

    static wrap(error: unknown, context: string): Error {
        return new Error(`${context}: ${this.getMessage(error)}`);
    }
}

Usage Statistics:

  • 46 direct pattern instances: error instanceof Error ? error.message : String(error)
  • 4 instances using ErrorUtils.getMessage()
  • Consistent approach across all modules

Custom Error Classes:

  • CompilationError
  • ConfigurationError
  • FileSystemError
  • NetworkError
  • SourceError
  • StorageError
  • TransformationError
  • ValidationError

All extend BaseError with proper error codes and context.


5. Import Organization ✅ EXCELLENT

Pattern:

  • All modules use barrel exports via index.ts files
  • Main entry point src/index.ts exports all public APIs
  • Uses Deno import map aliases (@std/path, @std/assert)
  • Explicit .ts extensions for relative imports (Deno requirement)
  • Type-only imports use import type where possible

Example:

// Good - using barrel export
import { ConfigurationValidator } from '../configuration/index.ts';

// Good - using import map alias
import { join } from '@std/path';

// Good - type-only import
import type { IConfiguration } from '../types/index.ts';

6. TypeScript Strictness ✅ EXCELLENT

Configuration in deno.json:

{
    "compilerOptions": {
        "strict": true,
        "noImplicitAny": true,
        "strictNullChecks": true,
        "noUnusedLocals": true,
        "noUnusedParameters": true
    }
}

Observations:

  • All strict TypeScript options enabled
  • No use of any types (per coding guidelines)
  • Consistent use of readonly for immutable arrays
  • Interfaces use I prefix (e.g., IConfiguration, ILogger)

7. Documentation ✅ EXCELLENT

Markdown Files:

  • README.md (1142 lines) - Comprehensive project documentation
  • CODE_REVIEW.md (642 lines) - This file
  • docs/EXTENSIBILITY.md (749 lines) - Extensibility guide
  • docs/TROUBLESHOOTING.md (677 lines) - Troubleshooting guide
  • docs/QUEUE_SUPPORT.md (639 lines) - Queue integration
  • docs/api/README.md (447 lines) - API documentation
  • Plus 12 more documentation files

JSDoc Coverage:

  • All public APIs have JSDoc comments
  • Interfaces are well-documented
  • Parameters and return types documented
  • Examples provided for complex APIs

8. Testing ✅ GOOD

Test Structure:

  • Tests co-located with source files (*.test.ts)
  • 41 test files across the codebase
  • Uses Deno's built-in test framework
  • Assertions use @std/assert

Example Test Files:

  • src/transformations/DeduplicateTransformation.test.ts
  • src/compiler/HeaderGenerator.test.ts
  • src/utils/RuleUtils.test.ts
  • worker/queue.integration.test.ts

Test Commands:

deno task test           # Run all tests
deno task test:watch     # Watch mode
deno task test:coverage  # With coverage

9. Security ✅ ADDRESSED

Function Constructor Issue:

The CODE_REVIEW.md identified unsafe use of new Function() in FilterDownloader.ts.

Status: The codebase now has a safe Boolean expression parser:

// src/utils/BooleanExpressionParser.ts
export function evaluateBooleanExpression(expression: string, platform?: string): boolean {
    // Safe tokenization and evaluation without Function constructor
}

Exported from main API:

export { evaluateBooleanExpression, getKnownPlatforms, isKnownPlatform } from './utils/index.ts';

Part B: Suggested Future Enhancements

The following are recommendations from the original CODE_REVIEW.md that could add value:

High Priority Features

  1. Incremental Compilation - Already implemented! ✅

    • IncrementalCompiler exists in src/compiler/IncrementalCompiler.ts
    • Supports cache storage and differential updates
  2. Conflict Detection - Already implemented! ✅

    • ConflictDetectionTransformation exists in src/transformations/ConflictDetectionTransformation.ts
    • Detects blocking vs. allowing rule conflicts
  3. Diff Report Generation - Already implemented! ✅

    • DiffGenerator exists in src/diff/index.ts
    • Supports markdown output

Medium Priority Features

  1. Rule Optimizer - Already implemented! ✅

    • RuleOptimizerTransformation exists in src/transformations/RuleOptimizerTransformation.ts
  2. Multiple Output Formats - Already implemented! ✅

    • src/formatters/ includes:
      • AdblockFormatter
      • HostsFormatter
      • DnsmasqFormatter
      • PiHoleFormatter
      • DoHFormatter
      • UnboundFormatter
      • JsonFormatter
  3. Plugin System - Already implemented! ✅

    • src/plugins/ includes full plugin architecture
    • Support for custom transformations and downloaders

Potential Future Additions

  1. Source Health Monitoring Dashboard

    • Web UI dashboard showing source availability and health trends
    • Historical availability charts
    • Response time tracking
  2. Scheduled Compilation (Cron-like)

    • Built-in scheduling for automatic recompilation
    • Webhook notifications on completion
    • Auto-deploy to CDN/storage
  3. DNS Lookup Validation

    • Validate that blocked domains actually resolve
    • Remove dead domains to reduce list size

Summary

Current Status: PRODUCTION-READY ✅

The adblock-compiler codebase is:

Well-Architected - Clean separation of concerns with logical module boundaries
Well-Documented - Comprehensive markdown docs and JSDoc coverage
Well-Tested - 41 test files co-located with source
Type-Safe - Strict TypeScript with no any types
Maintainable - Centralized configuration, consistent patterns
Extensible - Plugin system and platform abstraction layer
Feature-Rich - Incremental compilation, conflict detection, multiple output formats

Recent Fixes (2026-01-13)

✅ Version synchronization (PluginSystem.ts)
✅ Magic numbers centralization (ContentFetcher.ts, worker.ts)
✅ Documentation updates (README.md, copilot-instructions.md)
✅ Code review document updates

Recommendations

No Critical Issues Remain

Minor Suggestions:

  • Continue adding tests for edge cases
  • Consider adding benchmark comparisons to track performance over time
  • Potentially add integration tests for the complete Worker deployment

Overall: The codebase demonstrates excellent software engineering practices and is ready for continued production use and feature development.


This code review reflects the state of the codebase as of 2026-01-13 at version 0.7.18.

Diagnostics and Tracing System

The adblock-compiler includes a comprehensive diagnostics and tracing system that emits structured events throughout the compilation pipeline. These events can be captured by the Cloudflare Tail Worker for monitoring, debugging, and observability.

Overview

The diagnostics system provides:

  • Structured Event Emission: All operations emit standardized diagnostic events
  • Operation Tracing: Track the start, completion, and errors of operations
  • Performance Metrics: Record timing and resource usage metrics
  • Cache Events: Monitor cache hits, misses, and operations
  • Network Events: Track HTTP requests with timing and status codes
  • Error Tracking: Capture errors with full context and stack traces
  • Correlation IDs: Group related events across the compilation pipeline

Architecture

The system consists of three main components:

  1. DiagnosticsCollector: Aggregates and stores diagnostic events
  2. TracingContext: Provides context for operations through the pipeline
  3. Event Types: Structured event definitions for different categories

Basic Usage

Creating a Tracing Context

import { createTracingContext } from '@jk-com/adblock-compiler';

const tracingContext = createTracingContext({
    metadata: {
        userId: 'user123',
        requestId: 'req456',
    },
});

Using with FilterCompiler

import { createTracingContext, FilterCompiler } from '@jk-com/adblock-compiler';

const tracingContext = createTracingContext();

const compiler = new FilterCompiler({
    tracingContext,
});

const result = await compiler.compileWithMetrics(configuration, true);

// Access diagnostic events
const diagnostics = result.diagnostics;
console.log(`Collected ${diagnostics.length} diagnostic events`);

Using with WorkerCompiler

import { createTracingContext, WorkerCompiler } from '@jk-com/adblock-compiler';

const tracingContext = createTracingContext();

const compiler = new WorkerCompiler({
    preFetchedContent: sources,
    tracingContext,
});

const result = await compiler.compileWithMetrics(configuration);

// Diagnostics are included in the result
if (result.diagnostics) {
    for (const event of result.diagnostics) {
        console.log(`[${event.category}] ${event.message}`);
    }
}

Event Types

Operation Events

Track the lifecycle of operations:

// Operation Start
{
    eventId: "evt-123",
    timestamp: "2024-01-12T00:00:00.000Z",
    category: "compilation",
    severity: "debug",
    message: "Operation started: compileFilterList",
    correlationId: "trace-456",
    operation: "compileFilterList",
    input: {
        name: "My Filter List",
        sourceCount: 3
    }
}

// Operation Complete
{
    eventId: "evt-124",
    timestamp: "2024-01-12T00:00:01.234Z",
    category: "compilation",
    severity: "info",
    message: "Operation completed: compileFilterList (1234.56ms)",
    correlationId: "trace-456",
    operation: "compileFilterList",
    durationMs: 1234.56,
    output: {
        ruleCount: 5000
    }
}

// Operation Error
{
    eventId: "evt-125",
    timestamp: "2024-01-12T00:00:00.500Z",
    category: "error",
    severity: "error",
    message: "Operation failed: downloadSource - Network error",
    correlationId: "trace-456",
    operation: "downloadSource",
    errorType: "NetworkError",
    errorMessage: "Failed to fetch source",
    stack: "...",
    durationMs: 500
}

Performance Metrics

Record performance measurements:

{
    eventId: "evt-126",
    timestamp: "2024-01-12T00:00:01.000Z",
    category: "performance",
    severity: "debug",
    message: "Metric: inputRuleCount = 10000 rules",
    correlationId: "trace-456",
    metric: "inputRuleCount",
    value: 10000,
    unit: "rules",
    dimensions: {
        source: "my-source"
    }
}

Cache Events

Monitor cache operations:

{
    eventId: "evt-127",
    timestamp: "2024-01-12T00:00:00.100Z",
    category: "cache",
    severity: "debug",
    message: "Cache hit: cache-key-abc (1024 bytes)",
    correlationId: "trace-456",
    operation: "hit",
    key: "cache-key-abc",
    size: 1024
}

Network Events

Track HTTP requests:

{
    eventId: "evt-128",
    timestamp: "2024-01-12T00:00:00.200Z",
    category: "network",
    severity: "debug",
    message: "GET https://example.com/filters.txt - 200 (234.56ms)",
    correlationId: "trace-456",
    method: "GET",
    url: "https://example.com/filters.txt",
    statusCode: 200,
    durationMs: 234.56,
    responseSize: 50000
}

Tail Worker Integration

The diagnostics events are automatically emitted to console in the Cloudflare Worker, where they can be captured by the Tail Worker.

Event Emission

In worker/worker.ts, diagnostic events are emitted using severity-appropriate console methods:

function emitDiagnosticsToTailWorker(diagnostics: DiagnosticEvent[]): void {
    for (const event of diagnostics) {
        const logData = {
            ...event,
            source: 'adblock-compiler',
        };

        switch (event.severity) {
            case 'error':
                console.error('[DIAGNOSTIC]', JSON.stringify(logData));
                break;
            case 'warn':
                console.warn('[DIAGNOSTIC]', JSON.stringify(logData));
                break;
            case 'info':
                console.info('[DIAGNOSTIC]', JSON.stringify(logData));
                break;
            default:
                console.debug('[DIAGNOSTIC]', JSON.stringify(logData));
        }
    }
}

Tail Worker Consumption

The Tail Worker receives these events and can process them:

// In worker/tail.ts
export default {
    async tail(events: TailEvent[], env: TailEnv, ctx: ExecutionContext) {
        for (const event of events) {
            // Filter for diagnostic events
            const diagnosticLogs = event.logs.filter((log) => log.message.some((m) => typeof m === 'string' && m.includes('[DIAGNOSTIC]')));

            for (const log of diagnosticLogs) {
                // Parse and process diagnostic event
                const diagnostic = JSON.parse(log.message[1]);

                // Store in KV, forward to webhook, etc.
                if (env.TAIL_LOGS) {
                    await env.TAIL_LOGS.put(
                        `diagnostic:${diagnostic.eventId}`,
                        JSON.stringify(diagnostic),
                        { expirationTtl: 86400 },
                    );
                }
            }
        }
    },
};

Advanced Features

Manual Tracing

For custom operations, use the tracing utilities:

import { createTracingContext, traceAsync, traceSync } from '@jk-com/adblock-compiler';

const context = createTracingContext();

// Trace synchronous operation
const result = traceSync(context, 'myOperation', () => {
    // Your code here
    return processData();
}, { inputSize: 1000 });

// Trace asynchronous operation
const result = await traceAsync(context, 'myAsyncOperation', async () => {
    // Your async code here
    return await fetchData();
}, { url: 'https://example.com' });

Child Contexts

Create child contexts for nested operations:

import { createChildContext } from '@jk-com/adblock-compiler';

const parentContext = createTracingContext({
    metadata: { requestId: '123' },
});

const childContext = createChildContext(parentContext, {
    operationName: 'downloadSource',
});

// Child context inherits correlation ID and parent metadata

Filtering Events

Filter events by category or severity:

const diagnostics = context.diagnostics.getEvents();

// Filter by category
const networkEvents = diagnostics.filter((e) => e.category === 'network');

// Filter by severity
const errors = diagnostics.filter((e) => e.severity === 'error');

// Filter by correlation ID
const relatedEvents = diagnostics.filter((e) => e.correlationId === 'trace-123');

Best Practices

  1. Always use tracing contexts: Pass tracing contexts through your compilation pipeline
  2. Use correlation IDs: Group related events with correlation IDs
  3. Include metadata: Add relevant metadata to contexts for better debugging
  4. Monitor performance metrics: Track key metrics like rule counts and durations
  5. Handle errors properly: Ensure errors are captured in diagnostic events
  6. Clean up contexts: Clear diagnostic events when appropriate to prevent memory leaks

Examples

See worker/worker.ts for complete examples of integrating diagnostics into the Cloudflare Worker.

API Reference

createTracingContext(options?)

Creates a new tracing context.

Parameters:

  • options.correlationId?: Custom correlation ID
  • options.parent?: Parent tracing context
  • options.metadata?: Custom metadata object
  • options.diagnostics?: Custom diagnostics collector

Returns: TracingContext

DiagnosticsCollector

Collects and stores diagnostic events.

Methods:

  • operationStart(operation, input?): Start tracking an operation
  • operationComplete(eventId, output?): Mark operation as complete
  • operationError(eventId, error): Record an operation error
  • recordMetric(metric, value, unit, dimensions?): Record a performance metric
  • recordCacheEvent(operation, key, size?): Record a cache operation
  • recordNetworkEvent(method, url, statusCode?, durationMs?, responseSize?): Record a network request
  • emit(event): Emit a custom diagnostic event
  • getEvents(): Get all collected events
  • clear(): Clear all events

Troubleshooting

Events not appearing in tail worker

  1. Ensure the main worker has tail_consumers configured in wrangler.toml
  2. Verify diagnostic events are being emitted with console.log/error/etc
  3. Check tail worker is deployed and running

Too many events

  1. Use the NoOpDiagnosticsCollector for operations that don't need tracing
  2. Filter events by severity or category before storing
  3. Implement sampling to capture only a percentage of events

Performance impact

The diagnostics system is designed to be lightweight, but for high-throughput scenarios:

  1. Use createNoOpContext() to disable diagnostics entirely
  2. Sample diagnostic collection (e.g., 1 in 100 requests)
  3. Clear events periodically with diagnostics.clear()

Extensibility Guide

AdBlock Compiler is designed to be fully extensible. This guide shows you how to extend the compiler with custom transformations, fetchers, and more.

Table of Contents

Custom Transformations

Create custom transformations by extending the base Transformation classes.

Synchronous Transformation

For transformations that don't require async operations:

import { ITransformationContext, SyncTransformation, TransformationType } from '@jk-com/adblock-compiler';

// Custom transformation to add custom headers
class AddHeaderTransformation extends SyncTransformation {
    public readonly type = 'AddHeader' as TransformationType;
    public readonly name = 'Add Header';

    private header: string;

    constructor(header: string, logger?) {
        super(logger);
        this.header = header;
    }

    public executeSync(rules: string[], context?: ITransformationContext): string[] {
        this.info(`Adding custom header: ${this.header}`);
        return [this.header, ...rules];
    }
}

// Usage
const transformation = new AddHeaderTransformation('! Custom Filter List v1.0.0');
const result = await transformation.execute(rules);

Asynchronous Transformation

For transformations that fetch external data or perform async operations:

import { AsyncTransformation, ITransformationContext, TransformationType } from '@jk-com/adblock-compiler';

// Custom transformation to fetch and merge remote rules
class MergeRemoteRulesTransformation extends AsyncTransformation {
    public readonly type = 'MergeRemoteRules' as TransformationType;
    public readonly name = 'Merge Remote Rules';

    private remoteUrl: string;

    constructor(remoteUrl: string, logger?) {
        super(logger);
        this.remoteUrl = remoteUrl;
    }

    public async execute(rules: string[], context?: ITransformationContext): Promise<string[]> {
        this.info(`Fetching remote rules from: ${this.remoteUrl}`);

        try {
            const response = await fetch(this.remoteUrl);
            const remoteRules = (await response.text()).split('\n');

            this.info(`Merged ${remoteRules.length} remote rules`);
            return [...rules, ...remoteRules];
        } catch (error) {
            this.error(`Failed to fetch remote rules: ${error.message}`);
            return rules; // Return original rules on failure
        }
    }
}

// Usage
const transformation = new MergeRemoteRulesTransformation('https://example.com/extra-rules.txt');
const result = await transformation.execute(rules);

Advanced Transformation with Context

Access configuration and logger from context:

import { ITransformationContext, RuleUtils, SyncTransformation, TransformationType } from '@jk-com/adblock-compiler';

class SmartDeduplicateTransformation extends SyncTransformation {
    public readonly type = 'SmartDeduplicate' as TransformationType;
    public readonly name = 'Smart Deduplicate';

    public executeSync(rules: string[], context?: ITransformationContext): string[] {
        const config = context?.configuration;
        const logger = context?.logger || this.logger;

        logger.info('Starting smart deduplication...');

        // Group rules by type
        const allowRules: string[] = [];
        const blockRules: string[] = [];
        const comments: string[] = [];

        for (const rule of rules) {
            if (RuleUtils.isComment(rule)) {
                comments.push(rule);
            } else if (RuleUtils.isAllowRule(rule)) {
                allowRules.push(rule);
            } else {
                blockRules.push(rule);
            }
        }

        // Deduplicate each group
        const dedupedAllowRules = [...new Set(allowRules)];
        const dedupedBlockRules = [...new Set(blockRules)];
        const dedupedComments = [...new Set(comments)];

        logger.info(`Deduplicated: ${allowRules.length} → ${dedupedAllowRules.length} allow rules`);
        logger.info(`Deduplicated: ${blockRules.length} → ${dedupedBlockRules.length} block rules`);

        // Combine: comments first, then allow rules, then block rules
        return [...dedupedComments, ...dedupedAllowRules, ...dedupedBlockRules];
    }
}

Registering Custom Transformations

import { FilterCompiler, TransformationPipeline, TransformationRegistry } from '@jk-com/adblock-compiler';

// Create custom registry
const registry = new TransformationRegistry();

// Register custom transformations
registry.register('AddHeader' as any, new AddHeaderTransformation('! My Header'));
registry.register('SmartDeduplicate' as any, new SmartDeduplicateTransformation());

// Use custom registry in pipeline
const pipeline = new TransformationPipeline(registry);

// Or use with FilterCompiler
const compiler = new FilterCompiler({ transformationRegistry: registry });

Custom Fetchers

Implement custom content fetchers for different protocols or sources:

import { IContentFetcher, PreFetchedContent } from '@jk-com/adblock-compiler';

// Custom fetcher for FTP protocol
class FtpFetcher implements IContentFetcher {
    async canHandle(source: string): Promise<boolean> {
        return source.startsWith('ftp://');
    }

    async fetchContent(source: string): Promise<string> {
        // Your FTP client implementation
        console.log(`Fetching from FTP: ${source}`);

        // Example: use a Deno FTP library
        // const client = new FTPClient();
        // await client.connect(host, port);
        // const content = await client.download(path);
        // await client.close();
        // return content;

        throw new Error('FTP fetcher not implemented');
    }
}

// Custom fetcher for database sources
class DatabaseFetcher implements IContentFetcher {
    private connectionString: string;

    constructor(connectionString: string) {
        this.connectionString = connectionString;
    }

    async canHandle(source: string): Promise<boolean> {
        return source.startsWith('db://');
    }

    async fetchContent(source: string): Promise<string> {
        // Parse source: db://table/column
        const [table, column] = source.replace('db://', '').split('/');

        console.log(`Fetching from database: ${table}.${column}`);

        // Your database query implementation
        // const db = await connect(this.connectionString);
        // const result = await db.query(`SELECT ${column} FROM ${table}`);
        // return result.rows.map(row => row[column]).join('\n');

        throw new Error('Database fetcher not implemented');
    }
}

// Usage with CompositeFetcher
import { CompositeFetcher, HttpFetcher, PreFetchedContentFetcher } from '@jk-com/adblock-compiler';

const fetcher = new CompositeFetcher([
    new HttpFetcher(),
    new FtpFetcher(),
    new DatabaseFetcher('postgresql://localhost/filters'),
    new PreFetchedContentFetcher(preFetchedContent),
]);

// Use with PlatformDownloader
import { PlatformDownloader } from '@jk-com/adblock-compiler';

const downloader = new PlatformDownloader({ fetcher });
const content = await downloader.download('ftp://example.com/filters.txt');

Custom Event Handlers

Implement custom event tracking and monitoring:

import { CompilerEventEmitter, ICompilerEvents } from '@jk-com/adblock-compiler';

// Custom event handler that sends metrics to external service
class MetricsEventHandler implements ICompilerEvents {
    private metricsEndpoint: string;

    constructor(metricsEndpoint: string) {
        this.metricsEndpoint = metricsEndpoint;
    }

    onSourceStart(event: any): void {
        console.log(`[SOURCE START] ${event.source.name}`);
        this.sendMetric('source.start', {
            sourceName: event.source.name,
            timestamp: Date.now(),
        });
    }

    onSourceComplete(event: any): void {
        console.log(`[SOURCE COMPLETE] ${event.source.name}: ${event.ruleCount} rules`);
        this.sendMetric('source.complete', {
            sourceName: event.source.name,
            ruleCount: event.ruleCount,
            durationMs: event.durationMs,
        });
    }

    onSourceError(event: any): void {
        console.error(`[SOURCE ERROR] ${event.source.name}: ${event.error.message}`);
        this.sendMetric('source.error', {
            sourceName: event.source.name,
            error: event.error.message,
        });
    }

    onTransformationStart(event: any): void {
        console.log(`[TRANSFORM START] ${event.name}`);
    }

    onTransformationComplete(event: any): void {
        console.log(`[TRANSFORM COMPLETE] ${event.name}: ${event.inputCount} → ${event.outputCount}`);
        this.sendMetric('transformation.complete', {
            name: event.name,
            inputCount: event.inputCount,
            outputCount: event.outputCount,
            durationMs: event.durationMs,
        });
    }

    onTransformationError(event: any): void {
        console.error(`[TRANSFORM ERROR] ${event.name}: ${event.error.message}`);
    }

    onProgress(event: any): void {
        console.log(`[PROGRESS] ${event.phase}: ${event.current}/${event.total}`);
    }

    onCompilationComplete(event: any): void {
        console.log(`[COMPILATION COMPLETE] ${event.ruleCount} rules`);
        this.sendMetric('compilation.complete', {
            ruleCount: event.ruleCount,
            sourceCount: event.sourceCount,
            totalDurationMs: event.totalDurationMs,
        });
    }

    private async sendMetric(eventType: string, data: any): Promise<void> {
        try {
            await fetch(this.metricsEndpoint, {
                method: 'POST',
                headers: { 'Content-Type': 'application/json' },
                body: JSON.stringify({ eventType, data, timestamp: Date.now() }),
            });
        } catch (error) {
            console.error(`Failed to send metric: ${error.message}`);
        }
    }
}

// Usage
const metricsHandler = new MetricsEventHandler('https://metrics.example.com/events');

import { WorkerCompiler } from '@jk-com/adblock-compiler';
const compiler = new WorkerCompiler({
    events: metricsHandler,
});

Custom Loggers

Implement custom logging to integrate with your logging system:

import { ILogger } from '@jk-com/adblock-compiler';

// Custom logger that sends logs to external service
class RemoteLogger implements ILogger {
    private logEndpoint: string;
    private minLevel: 'debug' | 'info' | 'warn' | 'error';

    constructor(logEndpoint: string, minLevel = 'info') {
        this.logEndpoint = logEndpoint;
        this.minLevel = minLevel;
    }

    debug(message: string): void {
        if (this.shouldLog('debug')) {
            console.debug(`[DEBUG] ${message}`);
            this.send('debug', message);
        }
    }

    info(message: string): void {
        if (this.shouldLog('info')) {
            console.info(`[INFO] ${message}`);
            this.send('info', message);
        }
    }

    warn(message: string): void {
        if (this.shouldLog('warn')) {
            console.warn(`[WARN] ${message}`);
            this.send('warn', message);
        }
    }

    error(message: string): void {
        if (this.shouldLog('error')) {
            console.error(`[ERROR] ${message}`);
            this.send('error', message);
        }
    }

    private shouldLog(level: string): boolean {
        const levels = ['debug', 'info', 'warn', 'error'];
        return levels.indexOf(level) >= levels.indexOf(this.minLevel);
    }

    private async send(level: string, message: string): Promise<void> {
        try {
            await fetch(this.logEndpoint, {
                method: 'POST',
                headers: { 'Content-Type': 'application/json' },
                body: JSON.stringify({ level, message, timestamp: Date.now() }),
            });
        } catch (error) {
            // Don't log errors from logger itself
        }
    }
}

// Structured logger with context
class StructuredLogger implements ILogger {
    private context: Record<string, any>;

    constructor(context: Record<string, any> = {}) {
        this.context = context;
    }

    debug(message: string): void {
        this.log('DEBUG', message);
    }

    info(message: string): void {
        this.log('INFO', message);
    }

    warn(message: string): void {
        this.log('WARN', message);
    }

    error(message: string): void {
        this.log('ERROR', message);
    }

    private log(level: string, message: string): void {
        const logEntry = {
            timestamp: new Date().toISOString(),
            level,
            message,
            ...this.context,
        };
        console.log(JSON.stringify(logEntry));
    }

    withContext(additionalContext: Record<string, any>): StructuredLogger {
        return new StructuredLogger({ ...this.context, ...additionalContext });
    }
}

// Usage
const logger = new StructuredLogger({ service: 'adblock-compiler', version: '2.0.0' });
const compiler = new FilterCompiler({ logger });

// With additional context
const requestLogger = logger.withContext({ requestId: '123-456' });
const compiler2 = new FilterCompiler({ logger: requestLogger });

Extending the Compiler

Create custom compilers for specific use cases:

import { FilterCompiler, FilterCompilerOptions, IConfiguration, WorkerCompiler } from '@jk-com/adblock-compiler';

// Custom compiler that always applies specific transformations
class ProductionCompiler extends FilterCompiler {
    constructor(options?: FilterCompilerOptions) {
        super(options);
    }

    async compile(configuration: IConfiguration): Promise<string[]> {
        // Ensure production transformations are always applied
        const productionConfig = {
            ...configuration,
            transformations: [
                ...(configuration.transformations || []),
                'Validate', // Always validate
                'Deduplicate', // Always deduplicate
                'RemoveEmptyLines', // Always remove empty lines
            ],
        };

        return super.compile(productionConfig);
    }
}

// Custom compiler with automatic caching
class CachedCompiler extends FilterCompiler {
    private cache: Map<string, { rules: string[]; timestamp: number }>;
    private ttl: number;

    constructor(options?: FilterCompilerOptions, ttlMs: number = 3600000) {
        super(options);
        this.cache = new Map();
        this.ttl = ttlMs;
    }

    async compile(configuration: IConfiguration): Promise<string[]> {
        const cacheKey = JSON.stringify(configuration);
        const cached = this.cache.get(cacheKey);

        if (cached && (Date.now() - cached.timestamp) < this.ttl) {
            console.log('Cache HIT');
            return cached.rules;
        }

        console.log('Cache MISS');
        const rules = await super.compile(configuration);

        this.cache.set(cacheKey, {
            rules,
            timestamp: Date.now(),
        });

        return rules;
    }

    clearCache(): void {
        this.cache.clear();
    }
}

// Usage
const prodCompiler = new ProductionCompiler();
const cachedCompiler = new CachedCompiler(undefined, 3600000); // 1 hour TTL

Plugin System

Create a plugin system for your application:

import { FilterCompiler, IContentFetcher, ILogger, Transformation } from '@jk-com/adblock-compiler';

interface Plugin {
    name: string;
    version: string;
    initialize(compiler: FilterCompiler): void | Promise<void>;
}

// Analytics plugin
class AnalyticsPlugin implements Plugin {
    name = 'analytics';
    version = '1.0.0';

    initialize(compiler: FilterCompiler): void {
        console.log(`Initialized ${this.name} plugin v${this.version}`);
        // Register custom event handlers, transformations, etc.
    }
}

// Monitoring plugin
class MonitoringPlugin implements Plugin {
    name = 'monitoring';
    version = '1.0.0';
    private endpoint: string;

    constructor(endpoint: string) {
        this.endpoint = endpoint;
    }

    async initialize(compiler: FilterCompiler): Promise<void> {
        console.log(`Initialized ${this.name} plugin v${this.version}`);
        // Set up monitoring hooks
    }
}

// Plugin manager
class PluginManager {
    private plugins: Plugin[] = [];

    register(plugin: Plugin): void {
        this.plugins.push(plugin);
    }

    async initializeAll(compiler: FilterCompiler): Promise<void> {
        for (const plugin of this.plugins) {
            await plugin.initialize(compiler);
        }
    }

    getPlugin(name: string): Plugin | undefined {
        return this.plugins.find((p) => p.name === name);
    }
}

// Usage
const pluginManager = new PluginManager();
pluginManager.register(new AnalyticsPlugin());
pluginManager.register(new MonitoringPlugin('https://metrics.example.com'));

const compiler = new FilterCompiler();
await pluginManager.initializeAll(compiler);

Best Practices

1. Follow Interface Contracts

Always implement the required interfaces fully:

// Good: Implements all required methods
class MyFetcher implements IContentFetcher {
    canHandle(source: string): Promise<boolean> {/* ... */}
    fetchContent(source: string): Promise<string> {/* ... */}
}

// Bad: Missing required methods
class BadFetcher implements IContentFetcher {
    canHandle(source: string): Promise<boolean> {/* ... */}
    // Missing fetchContent!
}

2. Handle Errors Gracefully

class RobustTransformation extends SyncTransformation {
    public executeSync(rules: string[]): string[] {
        try {
            return rules.map((rule) => this.transformRule(rule));
        } catch (error) {
            this.error(`Transformation failed: ${error.message}`);
            return rules; // Return original rules on error
        }
    }

    private transformRule(rule: string): string {
        // Your transformation logic
        return rule;
    }
}

3. Use Logging

class VerboseTransformation extends SyncTransformation {
    public executeSync(rules: string[]): string[] {
        this.info(`Starting transformation with ${rules.length} rules`);

        const result = this.doTransform(rules);

        this.info(`Transformation complete: ${rules.length} → ${result.length} rules`);
        return result;
    }
}

4. Document Your Extensions

/**
 * Removes rules that match a specific pattern.
 * Useful for filtering out unwanted rules from upstream sources.
 *
 * @example
 * ```typescript
 * const transformation = new PatternFilterTransformation(/google\\.com/);
 * const filtered = await transformation.execute(rules);
 * ```
 */
class PatternFilterTransformation extends SyncTransformation {
    // Implementation...
}

5. Test Your Extensions

import { assertEquals } from '@std/assert';

Deno.test('MyTransformation should remove duplicates', async () => {
    const transformation = new MyTransformation();
    const input = ['rule1', 'rule2', 'rule1'];
    const output = await transformation.execute(input);
    assertEquals(output, ['rule1', 'rule2']);
});

Example: Complete Custom Extension

Here's a complete example combining multiple extensibility features:

import { FilterCompiler, IContentFetcher, ILogger, SyncTransformation, TransformationRegistry, TransformationType } from '@jk-com/adblock-compiler';

// 1. Custom transformation
class RemoveSocialMediaTransformation extends SyncTransformation {
    public readonly type = 'RemoveSocialMedia' as TransformationType;
    public readonly name = 'Remove Social Media';

    private socialDomains = ['facebook.com', 'twitter.com', 'instagram.com'];

    public executeSync(rules: string[]): string[] {
        return rules.filter((rule) => {
            return !this.socialDomains.some((domain) => rule.includes(domain));
        });
    }
}

// 2. Custom fetcher
class S3Fetcher implements IContentFetcher {
    async canHandle(source: string): Promise<boolean> {
        return source.startsWith('s3://');
    }

    async fetchContent(source: string): Promise<string> {
        // Implement S3 fetching
        throw new Error('S3 fetcher not implemented');
    }
}

// 3. Custom logger
class FileLogger implements ILogger {
    private logFile: string;

    constructor(logFile: string) {
        this.logFile = logFile;
    }

    debug(message: string): void {
        this.write('DEBUG', message);
    }
    info(message: string): void {
        this.write('INFO', message);
    }
    warn(message: string): void {
        this.write('WARN', message);
    }
    error(message: string): void {
        this.write('ERROR', message);
    }

    private write(level: string, message: string): void {
        const entry = `[${new Date().toISOString()}] ${level}: ${message}\n`;
        Deno.writeTextFileSync(this.logFile, entry, { append: true });
    }
}

// 4. Put it all together
const logger = new FileLogger('./compiler.log');
const registry = new TransformationRegistry(logger);
registry.register('RemoveSocialMedia' as any, new RemoveSocialMediaTransformation(logger));

const compiler = new FilterCompiler({
    logger,
    transformationRegistry: registry,
});

// 5. Use it
const config = {
    name: 'My Custom Filter',
    sources: [{ source: 'https://example.com/filters.txt' }],
    transformations: ['RemoveSocialMedia', 'Deduplicate'],
};

const rules = await compiler.compile(config);
console.log(`Compiled ${rules.length} rules`);

Resources

Contributing

If you create useful extensions, consider contributing them back to the project!

Open a pull request at https://github.com/jaypatrick/adblock-compiler/pulls


Questions? Open an issue at https://github.com/jaypatrick/adblock-compiler/issues

Transformation Hooks

The transformation hooks system gives you fine-grained, per-transformation observability hooks that fire before, after, and on error for every transformation in the compilation pipeline.

Table of Contents


Overview

The adblock-compiler has two complementary observability layers:

LayerWhat it coversAsync?Error hooks?
ICompilerEventsCompiler-level events (sources, progress, completion)NoNo
TransformationHookManagerPer-transformation lifecycle (before/after/error)YesYes

The hooks system was always fully implemented in TransformationHooks.ts but was previously never wired into the pipeline. This guide documents the completed wiring and how to use both layers.


Architecture

FilterCompiler.compile(config)
  │
  ├─ emitCompilationStart         ← ICompilerEvents.onCompilationStart
  │
  ├─ SourceCompiler.compile()     ← ICompilerEvents.onSourceStart / onSourceComplete
  │
  └─ TransformationPipeline.transform()
       │
       └─ for each transformation:
            ├─ emitProgress                             ← ICompilerEvents.onProgress
            ├─ hookManager.executeBeforeHooks(ctx)      ← beforeTransform hooks
            │     └─ [bridge hook → emitTransformationStart]  ← ICompilerEvents.onTransformationStart
            ├─ transformation.execute(rules, ctx)
            ├─ hookManager.executeAfterHooks(ctx)       ← afterTransform hooks
            │     └─ [bridge hook → emitTransformationComplete] ← ICompilerEvents.onTransformationComplete
            └─ (on error) hookManager.executeErrorHooks(ctx)  ← onError hooks
                                                           then re-throw

The bridge between the two layers is createEventBridgeHook, which is automatically registered by FilterCompiler and WorkerCompiler when ICompilerEvents listeners are present.


Hook types

beforeTransform

Fires immediately before a transformation processes its input rules.

type BeforeTransformHook = (context: TransformationHookContext) => void | Promise<void>;

The context object contains:

FieldTypeDescription
namestringTransformation type string (e.g. "RemoveComments")
typeTransformationTypeEnum value for type-safe comparison
ruleCountnumberNumber of rules entering the transformation
timestampnumberDate.now() at hook call time
metadataRecord<string, unknown>?Optional free-form metadata

afterTransform

Fires immediately after a transformation completes successfully.

type AfterTransformHook = (
  context: TransformationHookContext & {
    inputCount: number;
    outputCount: number;
    durationMs: number;
  }
) => void | Promise<void>;

The extended context adds:

FieldTypeDescription
inputCountnumberRule count entering the transformation
outputCountnumberRule count exiting the transformation
durationMsnumberWall-clock execution time in milliseconds

onError

Fires when a transformation throws an unhandled error.

type TransformErrorHook = (
  context: TransformationHookContext & { error: Error }
) => void | Promise<void>;

Important: Error hooks are observers only. They cannot suppress or replace the error. After all registered error hooks have been awaited the pipeline re-throws the original error unchanged.


TransformationHookManager

TransformationHookManager holds the registered hooks and exposes the fluent on* API for registering them.

Constructing with a config object

import { TransformationHookManager } from '@jk-com/adblock-compiler';

const manager = new TransformationHookManager({
  beforeTransform: [
    (ctx) => console.log(`▶ ${ctx.name} — ${ctx.ruleCount} rules`),
  ],
  afterTransform: [
    (ctx) => console.log(`✔ ${ctx.name} — ${ctx.durationMs.toFixed(2)}ms`),
  ],
  onError: [
    (ctx) => console.error(`✖ ${ctx.name}`, ctx.error),
  ],
});

Fluent registration

const manager = new TransformationHookManager()
  .onBeforeTransform((ctx) => console.log(`▶ ${ctx.name}`))
  .onAfterTransform((ctx) => console.log(`✔ ${ctx.name} — ${ctx.durationMs.toFixed(2)}ms`))
  .onTransformError((ctx) => console.error(`✖ ${ctx.name}`, ctx.error));

Async hooks

Hooks can return a Promise. The pipeline awaits each hook before proceeding:

manager.onAfterTransform(async (ctx) => {
  // Safely awaited — the pipeline waits for this before the next transformation
  await fetch('https://metrics.example.com/record', {
    method: 'POST',
    body: JSON.stringify({ name: ctx.name, durationMs: ctx.durationMs }),
  });
});

Using hooks with FilterCompiler

Pass a hookManager in FilterCompilerOptions:

import {
  FilterCompiler,
  TransformationHookManager,
  createLoggingHook,
} from '@jk-com/adblock-compiler';

const hookManager = new TransformationHookManager(createLoggingHook(console));

const compiler = new FilterCompiler({
  hookManager,
  events: {
    onCompilationComplete: (e) => console.log(`Done in ${e.totalDurationMs}ms`),
  },
});

await compiler.compile(config);
// → [Transform] Starting RemoveComments with 4123 rules
// → [Transform] Completed RemoveComments: 4123 → 3891 rules (-232) in 1.40ms
// → Done in 847ms

Hook manager resolution rules

FilterCompiler resolves the internal hook manager in the following order:

ConditionResult
hookManager provided, transformation events registeredInternal composed manager: bridge hook + delegate to user's manager
hookManager provided, no transformation eventsInternal composed manager: delegate to user's manager only
No hookManager, onTransformationStart/Complete registeredBridge-only manager
NeitherNoOpHookManager (zero overhead)

Important: FilterCompiler never mutates the caller's hookManager instance. An internal composed manager is always created, so the same hookManager can safely be shared across multiple FilterCompiler instances. This also means that passing a NoOpHookManager as hookManager works correctly — user hooks are skipped, but the bridge fires if transformation events are registered.

Targeted listener check: the bridge hook is installed only when onTransformationStart or onTransformationComplete is registered. Providing other listeners such as onProgress alone does not cause hook overhead on every transformation.


Built-in hook factories

createLoggingHook

Logs transformation start, completion, and errors to any { info, error } logger.

import { createLoggingHook, TransformationHookManager } from '@jk-com/adblock-compiler';

const manager = new TransformationHookManager(createLoggingHook(myLogger));

Output format:

[Transform] Starting RemoveComments with 4123 rules
[Transform] Completed RemoveComments: 4123 → 3891 rules (-232) in 1.40ms
[Transform] Error in Deduplicate: out of memory

createMetricsHook

Records per-transformation timing and rule-count diff to a custom collector.

import { createMetricsHook, TransformationHookManager } from '@jk-com/adblock-compiler';

const timings: Record<string, number> = {};
const manager = new TransformationHookManager(
  createMetricsHook({
    record: (name, durationMs, rulesDiff) => {
      timings[name] = durationMs;
      console.log(`${name}: ${durationMs.toFixed(2)}ms, ${rulesDiff >= 0 ? '-' : '+'}${Math.abs(rulesDiff)} rules`);
    },
  }),
);

Wire collector.record to Prometheus, StatsD, OpenTelemetry, or any custom metrics sink.

createEventBridgeHook

Bridges the hook system into the ICompilerEvents event bus. This is used automatically by FilterCompiler and WorkerCompiler — you do not normally need to call it directly.

It is useful if you are constructing TransformationPipeline manually and want ICompilerEvents.onTransformationStart / onTransformationComplete to still fire:

import {
  createEventBridgeHook,
  CompilerEventEmitter,
  TransformationHookManager,
  TransformationPipeline,
} from '@jk-com/adblock-compiler';

const eventEmitter = new CompilerEventEmitter({ onTransformationStart: (e) => console.log(e) });
const hookManager = new TransformationHookManager(createEventBridgeHook(eventEmitter));
const pipeline = new TransformationPipeline(undefined, logger, eventEmitter, hookManager);

Relationship to ICompilerEvents

ICompilerEvents.onTransformationStart and onTransformationComplete were previously fired by direct calls inside the TransformationPipeline loop. Those calls were removed when the hook system was wired in. The bridge hook re-implements that forwarding inside the hook system:

before hook fires → bridge hook → emitTransformationStart → onTransformationStart
after hook fires  → bridge hook → emitTransformationComplete → onTransformationComplete

Auto-wiring in TransformationPipeline

TransformationPipeline itself auto-wires the bridge hook in its constructor when an eventEmitter with transformation listeners is passed but no hookManager is provided:

// TransformationPipeline auto-detects this and wires the bridge:
new TransformationPipeline(undefined, logger, eventEmitterWithTransformListeners)
//                                            ↑ has onTransformationStart/Complete

This covers call sites like SourceCompiler that construct the pipeline without knowing about the hook system — they only pass an eventEmitter.

Targeted listener check

Both FilterCompiler, WorkerCompiler, and TransformationPipeline check specifically for onTransformationStart / onTransformationComplete rather than the general hasListeners() before installing a bridge hook. This means registering only onProgress or onCompilationComplete does not cause any hook execution overhead per transformation.

This means existing code that uses ICompilerEvents continues to work with no changes.


onCompilationStart event

A new onCompilationStart event was added to ICompilerEvents to complete the compiler lifecycle:

const compiler = new FilterCompiler({
  events: {
    onCompilationStart: (e) => {
      console.log(
        `Compiling "${e.configName}": ` +
        `${e.sourceCount} sources, ${e.transformationCount} transformations`
      );
    },
    onCompilationComplete: (e) => {
      console.log(`Completed in ${e.totalDurationMs}ms, ${e.ruleCount} output rules`);
    },
  },
});

The ICompilationStartEvent shape:

FieldTypeDescription
configNamestringIConfiguration.name
sourceCountnumberNumber of sources to be compiled
transformationCountnumberNumber of global transformations configured
timestampnumberDate.now() at emission time

The event fires after validation passes but before any source is fetched. This guarantees that sourceCount and transformationCount are correct (the configuration has been validated at this point).


NoOpHookManager

NoOpHookManager is the zero-cost default used when no hooks are registered. All three execute* methods are empty overrides and hasHooks() always returns false, so the pipeline's guard:

if (this.hookManager.hasHooks()) {
  await this.hookManager.executeBeforeHooks(context);
}

short-circuits immediately with no virtual dispatch overhead.

You never need to construct NoOpHookManager directly. It is the automatic default in:

  • new TransformationPipeline() (no hookManager arg)
  • new FilterCompiler() (no hookManager in options)
  • new FilterCompiler(logger) (legacy constructor)

Advanced: combining hooks and events

You can use both hookManager and events together. FilterCompiler automatically detects this combination and appends the bridge hook so both systems fire without double-registration:

import {
  FilterCompiler,
  TransformationHookManager,
  createMetricsHook,
} from '@jk-com/adblock-compiler';

const timings: Record<string, number> = {};

const compiler = new FilterCompiler({
  // Compiler-level events (fires at source and compilation boundaries)
  events: {
    onCompilationStart: (e) => console.log(`Starting: ${e.configName}`),
    onTransformationStart: (e) => console.log(`→ ${e.name}`),   // still fires via bridge
    onTransformationComplete: (e) => console.log(`← ${e.name}`), // still fires via bridge
    onCompilationComplete: (e) => console.log(`Done: ${e.totalDurationMs}ms`),
  },
  // Per-transformation hooks (async, with error hooks)
  hookManager: new TransformationHookManager(
    createMetricsHook({ record: (name, ms) => { timings[name] = ms; } }),
  ),
});

await compiler.compile(config);

Design decisions

Why hooks instead of modifying the Transformation base class?

Adding observability points to the Transformation base class would require every transformation to call super.beforeExecute() / super.afterExecute(), which ties the observability concern to the transformation's inheritance chain. External hooks are opt-in decorators — they attach to the pipeline, not to individual transformations, and work uniformly across all transformation types including third-party ones.

Why TransformationHookManager instead of bare callbacks?

A dedicated manager class keeps the TransformationPipeline's interface clean (three well-typed methods: executeBeforeHooks, executeAfterHooks, executeErrorHooks), while the manager handles ordering, registration, and the hasHooks() fast path. The pipeline has no knowledge of how many hooks are registered or how to call them.

Why the hasHooks() fast-path guard?

Without the guard, the pipeline would construct a context object, call executeBeforeHooks, and await it on every iteration — even when there are no hooks and every method is a no-op. The guard ensures the hot path (no hooks registered) has exactly zero overhead beyond a false boolean check. NoOpHookManager.hasHooks() is always false, so the guard always short-circuits for the default case.

Why fire onCompilationStart after validation?

Firing before validation would mean sourceCount and transformationCount could be undefined or wrong (the configuration hasn't been validated yet). Firing after validation guarantees that when onCompilationStart arrives at your handler, the numbers are accurate and the compilation will proceed — only fetch/download errors can still fail at that point.

Both FilterCompiler and WorkerCompiler fire this event at the equivalent point (after their respective validation passes), keeping the ICompilerEvents lifecycle consistent across both compiler implementations.

Why compose an internal manager instead of mutating the caller's hookManager?

The original code appended bridge hooks directly to the caller-supplied hookManager. This caused two problems:

  1. Duplicate events on reuse: if the same hookManager instance was passed to multiple FilterCompiler instances, each one would append another set of bridge hooks, causing onTransformationStart/Complete to fire multiple times per transformation.
  2. Broken for NoOpHookManager: NoOpHookManager.hasHooks() always returns false, so any hooks appended to it would never execute in the pipeline.

The fix: always compose a fresh internal manager. The bridge hook (if needed) and a delegation wrapper (if the user's manager has hooks) are both registered on the new internal manager, which is then passed to the pipeline. The caller's instance is never touched.

Why check only for transformation-specific listeners?

hasListeners() returns true if any ICompilerEvents handler is registered — including onProgress, onCompilationComplete, etc. Installing the bridge hook whenever any event is registered would add await overhead on every transformation iteration even when onTransformationStart/Complete are not subscribed.

The fix: check options?.events?.onTransformationStart || onTransformationComplete directly. Only when one of these two is present does a bridge hook get installed.

Why does createEventBridgeHook exist?

Before the hooks system was wired in, TransformationPipeline called eventEmitter.emitTransformationStart / emitTransformationComplete directly in the loop. When those calls were removed (to route everything through hooks), existing callers using ICompilerEvents.onTransformationStart / onTransformationComplete would have stopped receiving events. The bridge hook re-implements exactly that forwarding inside the hook system, maintaining full backward compatibility.

count-loc.sh — Lines of Code Counter

Location: scripts/count-loc.sh Added: 2026-03-08 Shell: zsh (no external dependencies — standard POSIX tools only)


Overview

count-loc.sh is a zero-dependency shell script that counts lines of code across the entire repository, broken down by language. It is designed to run quickly against a local clone without requiring any third-party tools such as tokei or cloc.

It lives in scripts/ alongside the other TypeScript utility scripts (sync-version.ts, generate-docs.ts, etc.) and follows the same convention of being run from the repository root.


Usage

# Make executable once
chmod +x scripts/count-loc.sh

# Full language breakdown (default)
./scripts/count-loc.sh

# Exclude lock files, *.d.ts, and minified files
./scripts/count-loc.sh --no-vendor

# Print only the grand total — useful for CI badges or scripting
./scripts/count-loc.sh --total

# Help
./scripts/count-loc.sh --help

Options

FlagDescription
(none)Count all recognised source files; print a per-language table
--no-vendorAdditionally exclude lock files and generated/minified artefacts
--totalPrint only the integer grand total and exit
--help / -hPrint usage and exit

Sample Output

Language                           Lines   Share
------------------------------ ----------  ------
TypeScript                          14823   71.2%
Markdown                             3201   15.4%
YAML                                  892    4.3%
JSON                                  741    3.6%
Shell                                 312    1.5%
CSS                                   289    1.4%
HTML                                  201    1.0%
TOML                                  198    1.0%
Python                                155    0.7%
------------------------------ ----------  ------
TOTAL                               20812  100%

How It Works

1. Repo-root resolution

The script uses zsh's ${0:A:h} (absolute path of the script's directory) and navigates one level up to find the repo root, so it works correctly regardless of where it is invoked from:

SCRIPT_DIR="${0:A:h}"       # → /path/to/repo/scripts
REPO_ROOT="${SCRIPT_DIR:h}" # → /path/to/repo
cd "$REPO_ROOT"

2. Directory pruning

find prune expressions are built dynamically from PRUNE_DIRS to skip noisy directories in a single traversal pass:

node_modules  .git  dist  build  .wrangler
output  coverage  .turbo  .next  .angular

3. Language detection

Files are matched by extension using an associative array (typeset -A EXT_LANG). Dockerfiles (no extension) are matched by name pattern instead.

Recognised extensions:

Extension(s)Language
.tsTypeScript
.tsxTypeScript (TSX)
.jsJavaScript
.mjs / .cjsJavaScript (ESM / CJS)
.cssCSS
.scssSCSS
.htmlHTML
.pyPython
.sh / .zshShell / Zsh
.tomlTOML
.yaml / .ymlYAML
.jsonJSON
.mdMarkdown
.sqlSQL
Dockerfile*Dockerfile

4. Vendor filtering (--no-vendor)

When --no-vendor is passed, files matching the following patterns are excluded via grep -v after collection:

pnpm-lock.yaml   package-lock.json   deno.lock   yarn.lock
*.min.js         *.min.css           *.generated.ts   *.d.ts

5. Line counting

Lines are counted with xargs wc -l, which is the fastest approach on macOS and Linux for large file sets. The total is extracted from wc's own summary line and accumulated per language.


What Is and Is Not Counted

Always counted (default mode)

  • All source files matching the recognised extensions above
  • Lock files (pnpm-lock.yaml, deno.lock, etc.)
  • TypeScript declaration files (*.d.ts)
  • Minified files

Excluded by default

  • node_modules/
  • .git/
  • dist/, build/, output/
  • .wrangler/, .angular/, .turbo/, .next/
  • coverage/

Additionally excluded with --no-vendor

  • pnpm-lock.yaml, package-lock.json, deno.lock, yarn.lock
  • *.d.ts
  • *.min.js, *.min.css
  • *.generated.ts

Note: The script counts all lines (including blank lines and comments). It does not perform semantic filtering. For blank/comment-stripped counts, use tokei or cloc (see Alternatives below).


Integration

CI / GitHub Actions

Use --total to surface the line count as a step output or log annotation:

- name: Count lines of code
  run: |
    chmod +x scripts/count-loc.sh
    LOC=$(./scripts/count-loc.sh --total)
    echo "Total LOC: $LOC"
    echo "loc=$LOC" >> "$GITHUB_OUTPUT"

Pre-commit hook

# .git/hooks/pre-commit
#!/usr/bin/env zsh
echo "Repository LOC:"
./scripts/count-loc.sh --no-vendor

Alternatives

For richer output (blank lines, comment lines, source lines broken out separately), install one of these popular tools:

# tokei — fastest, Rust-based
brew install tokei
tokei .

# cloc — Perl-based, very detailed
brew install cloc
cloc --exclude-dir=node_modules,.git .

Both are referenced in a comment at the bottom of count-loc.sh as a reminder.


Frontend Documentation

Documentation for the Adblock Compiler frontend applications and UI components.

Contents

  • Angular Frontend - Angular 21 SPA with zoneless change detection, Material Design 3, and SSR
  • SPA Benefits Analysis - Analysis of SPA benefits and migration recommendations
  • Tailwind CSS - Utility-first CSS framework integration with PostCSS
  • Validation UI - Color-coded validation error UI component
  • Vite Integration - Frontend build pipeline with HMR, multi-page app, and React/Vue support

Angular Frontend — Developer Reference

Audience: Contributors and integrators working on the Angular frontend. Location: frontend/ directory of the adblock-compiler monorepo. Status: Production-ready reference implementation — Angular 21, zoneless, SSR, Cloudflare Workers.


Table of Contents

  1. Overview
  2. Quick Start
  3. Architecture Overview
  4. Project Structure
  5. Technology Stack
  6. Angular 21 API Patterns
  7. Component Catalog
  8. Services Catalog
  9. State Management
  10. Routing
  11. SSR and Rendering Modes
  12. Accessibility (WCAG 2.1)
  13. Security
  14. Testing
  15. Cloudflare Workers Deployment
  16. Configuration Tokens
  17. Extending the Frontend
  18. Migration Reference (v16 → v21)
  19. Further Reading

Overview

The frontend/ directory contains a complete Angular 21 application that serves as the production UI for the Adblock Compiler API. It is designed as a showcase of every major modern Angular API, covering:

  • Zoneless change detection (no zone.js)
  • Signal-first state and component API
  • Server-Side Rendering (SSR) on Cloudflare Workers
  • Angular Material 3 design system
  • PWA / Service Worker support
  • End-to-end Playwright tests
  • Vitest unit tests with @analogjs/vitest-angular

The application connects to the Cloudflare Worker API (/api/*) and provides six pages: Home, Compiler, Performance, Validation, API Docs, and Admin.


Quick Start

# 1. Install dependencies
cd frontend
npm install

# 2. Start the CSR dev server (fastest iteration)
npm start              # → http://localhost:4200

# 3. Build SSR bundle
npm run build

# 4. Preview with Wrangler (mirrors Cloudflare Workers production)
npm run preview        # → http://localhost:8787

# 5. Deploy to Cloudflare Workers
deno task wrangler:deploy

# 6. Run unit tests (Vitest)
npm test               # single pass
npm run test:watch     # watch mode
npm run test:coverage  # V8 coverage report in coverage/

# 7. Run E2E tests (Playwright — requires dev server running)
npx playwright test

Architecture Overview

graph TD
    subgraph Browser["Browser / CDN Edge"]
        NG["Angular SPA<br/>Angular 21 · Zoneless · Material 3"]
        SW["Service Worker<br/>@angular/service-worker"]
    end

    subgraph CFW["Cloudflare Worker (SSR)"]
        AE["AngularAppEngine<br/>fetch handler · CSP headers"]
        ASSETS["Static Assets<br/>ASSETS binding · CDN"]
    end

    subgraph API["Adblock Compiler API"]
        COMPILE["/api/compile<br/>POST — SSE stream"]
        METRICS["/api/metrics<br/>GET — performance stats"]
        HEALTH["/api/health<br/>GET — liveness check"]
        VALIDATE["/api/validate<br/>POST — rule validation"]
        STORAGE["/api/storage/*<br/>Admin R2 — D1 endpoints"]
    end

    Browser -->|HTML request| CFW
    AE -->|SSR HTML| Browser
    ASSETS -->|JS/CSS/fonts| Browser
    SW -->|Cache first| Browser
    NG -->|REST / SSE| API

Data Flow for a Compilation Request

sequenceDiagram
    actor User
    participant CC as CompilerComponent
    participant TS as TurnstileService
    participant SSE as SseService
    participant API as /api/compile/stream

    User->>CC: Fills form, clicks Compile
    CC->>TS: turnstileToken() — bot check
    TS-->>CC: token (or empty if disabled)
    CC->>SSE: connect('/compile/stream', body)
    SSE->>API: POST (fetch + ReadableStream)
    API-->>SSE: SSE events (progress, result, done)
    SSE-->>CC: events() signal updated
    CC-->>User: Renders log lines via CDK Virtual Scroll

Project Structure

frontend/
├── src/
│   ├── app/
│   │   ├── app.component.ts            # Root shell: sidenav, toolbar, theme toggle
│   │   ├── app.config.ts               # Browser providers: zoneless, router, HTTP, SSR hydration
│   │   ├── app.config.server.ts        # SSR providers: mergeApplicationConfig(), absolute API URL
│   │   ├── app.routes.ts               # Lazy-loaded routes with titles + route data
│   │   ├── app.routes.server.ts        # Per-route render mode (Server / Prerender / Client)
│   │   ├── tokens.ts                   # InjectionToken declarations (API_BASE_URL, TURNSTILE_SITE_KEY)
│   │   ├── route-animations.ts         # Angular Animations trigger for route transitions
│   │   │
│   │   ├── compiler/
│   │   │   └── compiler.component.ts   # rxResource(), linkedSignal(), SSE streaming, Turnstile, CDK Virtual Scroll
│   │   ├── home/
│   │   │   └── home.component.ts       # MetricsStore, @defer on viewport, skeleton loading
│   │   ├── performance/
│   │   │   └── performance.component.ts  # httpResource(), MetricsStore, SparklineComponent
│   │   ├── admin/
│   │   │   └── admin.component.ts      # Auth guard, rxResource(), CDK Virtual Scroll, SQL console
│   │   ├── api-docs/
│   │   │   └── api-docs.component.ts   # httpResource() for /api/version endpoint
│   │   ├── validation/
│   │   │   └── validation.component.ts # Rule validation, color-coded output
│   │   │
│   │   ├── error/
│   │   │   ├── global-error-handler.ts         # Custom ErrorHandler with signal state
│   │   │   └── error-boundary.component.ts     # Dismissible error overlay
│   │   ├── guards/
│   │   │   └── admin.guard.ts          # Functional CanActivateFn for admin route
│   │   ├── interceptors/
│   │   │   └── error.interceptor.ts    # Functional HttpInterceptorFn (401, 429, 5xx)
│   │   ├── skeleton/
│   │   │   ├── skeleton-card.component.ts      # mat-card (outlined) + mat-progress-bar buffer + shimmer card placeholder
│   │   │   └── skeleton-table.component.ts     # mat-card (outlined) + mat-progress-bar buffer + shimmer table placeholder
│   │   ├── sparkline/
│   │   │   └── sparkline.component.ts  # mat-card (outlined) wrapper, Canvas 2D mini chart (zero dependencies)
│   │   ├── stat-card/
│   │   │   ├── stat-card.component.ts  # input() / output() / model() demo component
│   │   │   └── stat-card.component.spec.ts
│   │   ├── store/
│   │   │   └── metrics.store.ts        # Shared singleton signal store with SWR cache
│   │   ├── turnstile/
│   │   │   └── turnstile.component.ts  # mat-card (outlined) wrapper, Cloudflare Turnstile CAPTCHA widget
│   │   ├── services/
│   │   │   ├── auth.service.ts         # Admin key management (sessionStorage)
│   │   │   ├── compiler.service.ts     # POST /api/compile — Observable HTTP
│   │   │   ├── filter-parser.service.ts  # Web Worker bridge for off-thread parsing
│   │   │   ├── metrics.service.ts      # GET /api/metrics, /api/health
│   │   │   ├── sse.service.ts          # Generic fetch-based SSE client returning signals
│   │   │   ├── storage.service.ts      # Admin R2/D1 storage endpoints
│   │   │   ├── swr-cache.service.ts    # Generic stale-while-revalidate signal cache
│   │   │   ├── theme.service.ts        # Dark/light theme signal state, SSR-safe
│   │   │   ├── turnstile.service.ts    # Turnstile widget lifecycle + token signal
│   │   │   └── validation.service.ts   # POST /api/validate
│   │   └── workers/
│   │       └── filter-parser.worker.ts # Off-thread Web Worker: filter list parsing
│   │
│   ├── e2e/                            # Playwright E2E tests
│   │   ├── playwright.config.ts
│   │   ├── home.spec.ts
│   │   ├── compiler.spec.ts
│   │   └── navigation.spec.ts
│   ├── index.html                      # App shell: Turnstile script tag, npm fonts
│   ├── main.ts                         # bootstrapApplication()
│   ├── main.server.ts                  # Server bootstrap (imported by server.ts)
│   ├── styles.css                      # @fontsource/roboto + material-symbols imports
│   └── test-setup.ts                   # Vitest global setup: imports @angular/compiler
│
├── server.ts                           # Cloudflare Workers fetch handler + CSP headers
├── ngsw-config.json                    # PWA / Service Worker cache config
├── angular.json                        # Angular CLI workspace configuration
├── vitest.config.ts                    # Vitest + @analogjs/vitest-angular configuration
├── wrangler.toml                       # Cloudflare Workers deployment configuration
├── tsconfig.json                       # Base TypeScript config
├── tsconfig.app.json                   # App-specific TS config
└── tsconfig.spec.json                  # Spec-specific TS config (vitest/globals types)

Technology Stack

TechnologyVersionRole
Angular^21.0.0Application framework
Angular Material^21.0.0Material Design 3 component library
@angular/ssr^21.0.0Server-Side Rendering (edge-fetch adapter)
@angular/cdk^21.0.0Layout, virtual scrolling, accessibility (a11y) utilities
@angular/service-worker^21.0.0PWA / Service Worker support
RxJS~7.8.2Async streams for HTTP and route params
TypeScript~5.8.0Type safety throughout
Cloudflare WorkersEdge SSR deployment platform
WranglerCloudflare Workers CLI (deploy + local dev)
Vitest^3.0.0Fast unit test runner (replaces Karma)
@analogjs/vitest-angular^1.0.0Angular compiler plugin for Vitest
TailwindCSS^4.xUtility-first CSS; bridged to Angular Material M3 tokens via @theme inline
PlaywrightE2E browser test framework
@fontsource/roboto^5.xRoboto font — npm package, no CDN dependency
material-symbols^0.31.0Material Symbols icon font — npm package, no CDN

Angular 21 API Patterns

This section documents every modern Angular API demonstrated in the frontend, with annotated code samples drawn directly from the source.


1. signal() / computed() / effect()

The foundation of Angular's reactive model. All mutable component state uses signal(). Derived values use computed(). Side-effects use effect().

import { signal, computed, effect } from '@angular/core';

// Writable signal
readonly compilationCount = signal(0);

// Computed signal — automatically re-derives when compilationCount changes
readonly doubleCount = computed(() => this.compilationCount() * 2);

constructor() {
    // effect() runs once immediately, then again whenever any read signal changes
    effect(() => {
        console.log('Count:', this.compilationCount());
    });
}

// Mutate with .set() or .update()
this.compilationCount.set(5);
this.compilationCount.update(n => n + 1);

Template binding:

<p>Count: {{ compilationCount() }}</p>
<p>Double: {{ doubleCount() }}</p>
<button (click)="compilationCount.update(n => n + 1)">Increment</button>

See: services/theme.service.ts, store/metrics.store.ts


2. input() / output() / model()

Replaces @Input(), @Output() + EventEmitter, and the @Input()/@Output() pair for two-way binding.

import { input, output, model } from '@angular/core';

@Component({ selector: 'app-stat-card', standalone: true, /* … */ })
export class StatCardComponent {
    // input.required() — compile error if parent omits this binding
    readonly label = input.required<string>();

    // input() with default value
    readonly color = input<string>('#1976d2');

    // output() — replaces @Output() clicked = new EventEmitter<string>()
    readonly cardClicked = output<string>();

    // model() — two-way writable signal (replaces @Input()/@Output() pair)
    // Parent uses [(highlighted)]="isHighlighted"
    readonly highlighted = model<boolean>(false);

    click(): void {
        this.cardClicked.emit(this.label());
        this.highlighted.update(h => !h);   // write back to parent via model()
    }
}

Parent template:

<app-stat-card
    label="Filter Lists"
    color="primary"
    [(highlighted)]="isHighlighted"
    (cardClicked)="onCardClick($event)"
/>

See: stat-card/stat-card.component.ts


3. viewChild() / viewChildren()

Replaces @ViewChild / @ViewChildren decorators. Returns Signal<T | undefined> — no AfterViewInit hook needed.

import { viewChild, viewChildren, ElementRef } from '@angular/core';
import { MatSidenav } from '@angular/material/sidenav';

@Component({ /* … */ })
export class AppComponent {
    // Replaces: @ViewChild('sidenav') sidenav!: MatSidenav;
    readonly sidenavRef = viewChild<MatSidenav>('sidenav');

    // Read the signal like any other — resolves after view initialises
    openSidenav(): void {
        this.sidenavRef()?.open();
    }
}

See: app.component.ts, home/home.component.ts


4. @defer — Deferrable Views

Lazily loads and renders a template block when a trigger fires. Enables incremental hydration in SSR: the placeholder HTML ships in the initial payload and the heavy component chunk hydrates progressively.

<!-- Load when the block enters the viewport -->
@defer (on viewport; prefetch on hover) {
    <app-feature-highlights />
} @placeholder (minimum 200ms) {
    <app-skeleton-card lines="3" />
} @loading (minimum 300ms; after 100ms) {
    <mat-spinner diameter="32" />
} @error {
    <p>Failed to load</p>
}

<!-- Load when the browser is idle -->
@defer (on idle) {
    <app-summary-stats />
} @placeholder {
    <mat-spinner diameter="24" />
}

Available triggers:

TriggerWhen it fires
on viewportBlock enters the viewport (IntersectionObserver)
on idlerequestIdleCallback fires
on interactionFirst click or focus inside the placeholder
on timer(n)After n milliseconds
when (expr)When a signal/boolean becomes truthy
prefetch on hoverPre-fetches the chunk on hover but delays render

See: home/home.component.ts


5. rxResource() / httpResource()

rxResource() (from @angular/core/rxjs-interop) — replaces the loading / error / result signal trio and manual subscribe/unsubscribe boilerplate. The loader returns an Observable.

import { rxResource } from '@angular/core/rxjs-interop';

@Component({ /* … */ })
export class CompilerComponent {
    // pendingRequest drives the resource — undefined keeps it Idle
    private readonly pendingRequest = signal<CompileRequest | undefined>(undefined);

    readonly compileResource = rxResource<CompileResponse, CompileRequest | undefined>({
        request: () => this.pendingRequest(),
        loader: ({ request }) => this.compilerService.compile(
            request.urls,
            request.transformations,
        ),
    });

    submit(): void {
        this.pendingRequest.set({ urls: ['https://…'], transformations: ['Deduplicate'] });
    }
}

Template:

@if (compileResource.isLoading()) {
    <mat-spinner />
} @else if (compileResource.value(); as result) {
    <pre>{{ result | json }}</pre>
} @else if (compileResource.error(); as err) {
    <p class="error">{{ err }}</p>
}

httpResource() (Angular 21, from @angular/common/http) — declarative HTTP fetching that wires directly to a URL signal. No service needed for simple GET requests.

import { httpResource } from '@angular/common/http';

@Component({ /* … */ })
export class ApiDocsComponent {
    readonly versionResource = httpResource<{ version: string }>('/api/version');

    // In template:
    // versionResource.value()?.version
    // versionResource.isLoading()
    // versionResource.error()
}

See: compiler/compiler.component.ts, api-docs/api-docs.component.ts, performance/performance.component.ts


6. linkedSignal()

A writable signal whose value automatically resets when a source signal changes, but can be overridden manually between resets. Useful for preset-driven form defaults that the user can still customise.

import { signal, linkedSignal } from '@angular/core';

readonly selectedPreset = signal<string>('EasyList');
readonly presets = [
    { label: 'EasyList',   urls: ['https://easylist.to/easylist/easylist.txt'] },
    { label: 'AdGuard DNS', urls: ['https://adguardteam.github.io/…'] },
];

// Resets to preset URLs when selectedPreset changes
// but the user can still edit them manually
readonly presetUrls = linkedSignal(() => {
    const preset = this.presets.find(p => p.label === this.selectedPreset());
    return preset?.urls ?? [''];
});

// User can override without triggering a reset:
this.presetUrls.set(['https://my-custom-list.txt']);

// Switching preset resets back to preset defaults:
this.selectedPreset.set('AdGuard DNS');
// presetUrls() is now ['https://adguardteam.github.io/…']

See: compiler/compiler.component.ts


7. afterRenderEffect()

The correct API for reading or writing the DOM after Angular commits a render. Unlike effect() in the constructor, this is guaranteed to run after layout is complete.

import { viewChild, signal, afterRenderEffect, ElementRef } from '@angular/core';

@Component({ /* … */ })
export class BenchmarkComponent {
    readonly tableHeight = signal(0);
    readonly benchmarkTableRef = viewChild<ElementRef>('benchmarkTable');

    constructor() {
        afterRenderEffect(() => {
            const el = this.benchmarkTableRef()?.nativeElement as HTMLElement | undefined;
            if (el) {
                // Safe: DOM is fully committed at this point
                this.tableHeight.set(el.offsetHeight);
            }
        });
    }
}

Use cases: chart integrations, scroll position restore, focus management, third-party DOM libraries, canvas sizing.


8. provideAppInitializer()

Replaces the verbose APP_INITIALIZER injection token + factory function. Available and stable since Angular v19.

import { provideAppInitializer, inject } from '@angular/core';
import { ThemeService } from './services/theme.service';

// OLD pattern (still works but verbose):
{
    provide: APP_INITIALIZER,
    useFactory: (theme: ThemeService) => () => theme.loadPreferences(),
    deps: [ThemeService],
    multi: true,
}

// NEW pattern — no deps array, inject() works directly:
provideAppInitializer(() => {
    inject(ThemeService).loadPreferences();
})

The callback runs synchronously before the first render. Return a Promise or Observable to block rendering until async initialisation completes. Used here to apply the saved theme class to <body> before the first paint, preventing theme flash on load.

See: app.config.ts, services/theme.service.ts


9. toSignal() / takeUntilDestroyed()

Both helpers come from @angular/core/rxjs-interop and bridge RxJS Observables with the Signals world.

toSignal() — converts any Observable to a Signal. Auto-unsubscribes when the component is destroyed.

import { toSignal } from '@angular/core/rxjs-interop';
import { BreakpointObserver, Breakpoints } from '@angular/cdk/layout';
import { map } from 'rxjs/operators';

@Component({ /* … */ })
export class AppComponent {
    private readonly breakpointObserver = inject(BreakpointObserver);

    // Observable → Signal; initialValue prevents undefined on first render
    readonly isMobile = toSignal(
        this.breakpointObserver.observe([Breakpoints.Handset])
            .pipe(map(result => result.matches)),
        { initialValue: false },
    );
}

takeUntilDestroyed() — replaces the Subject<void> + ngOnDestroy teardown pattern.

import { takeUntilDestroyed } from '@angular/core/rxjs-interop';
import { DestroyRef, inject } from '@angular/core';

@Component({ /* … */ })
export class CompilerComponent {
    private readonly destroyRef = inject(DestroyRef);

    ngOnInit(): void {
        this.route.queryParamMap
            .pipe(takeUntilDestroyed(this.destroyRef))
            .subscribe(params => {
                // Handles unsubscription automatically on destroy
            });
    }
}

See: app.component.ts, compiler/compiler.component.ts


10. @if / @for / @switch

Angular 17+ built-in control flow. Replaces *ngIf, *ngFor, and *ngSwitch structural directives. No NgIf, NgFor, or NgSwitch import needed.

<!-- @if with else-if chain -->
@if (compileResource.isLoading()) {
    <mat-spinner />
} @else if (compileResource.value(); as result) {
    <pre>{{ result | json }}</pre>
} @else {
    <p>No results yet.</p>
}

<!-- @for with empty block — track is required -->
@for (item of runs(); track item.run) {
    <tr>
        <td>{{ item.run }}</td>
        <td>{{ item.duration }}</td>
    </tr>
} @empty {
    <tr><td colspan="2">No runs yet</td></tr>
}

<!-- @switch -->
@switch (status()) {
    @case ('loading')  { <mat-spinner /> }
    @case ('error')    { <p class="error">Error</p> }
    @default           { <p>Idle</p> }
}

11. inject()

Functional Dependency Injection — replaces constructor parameter injection. Works in components, services, directives, pipes, and provideAppInitializer() callbacks.

import { inject } from '@angular/core';
import { HttpClient } from '@angular/common/http';
import { Router } from '@angular/router';

@Injectable({ providedIn: 'root' })
export class CompilerService {
    // No constructor() needed for DI
    private readonly http   = inject(HttpClient);
    private readonly router = inject(Router);
}

See: Every service and component in the frontend.


12. Zoneless Change Detection

Enabled in app.config.ts via provideZonelessChangeDetection(). zone.js is not loaded. Change detection is driven purely by signal writes and the microtask scheduler.

// app.config.ts
import { provideZonelessChangeDetection } from '@angular/core';

export const appConfig: ApplicationConfig = {
    providers: [
        provideZonelessChangeDetection(),
        // …
    ],
};

Benefits:

  • Smaller initial bundle (no zone.js polyfill)
  • Predictable rendering — only components consuming changed signals re-render
  • Simpler mental model — no hidden monkey-patching of setTimeout, fetch, etc.
  • Required for SSR edge runtimes that do not support zone.js

Gotcha: Never mutate state outside Angular's scheduler without calling signal.set(). Imperative DOM mutations (e.g. jQuery, direct innerHTML writes) will not trigger re-renders.


13. Multi-Mode SSR

Defined in src/app/app.routes.server.ts, Angular 21 supports three per-route rendering strategies:

ModeBehaviourBest for
RenderMode.PrerenderHTML generated once at build time (SSG)Fully static content
RenderMode.ServerHTML rendered per request inside the WorkerDynamic / user-specific pages
RenderMode.ClientNo server rendering, pure CSRRoutes with DOM-dependent Material components (e.g. mat-slide-toggle)
// app.routes.server.ts
import { RenderMode, ServerRoute } from '@angular/ssr';

export const serverRoutes: ServerRoute[] = [
    // Home and Compiler use CSR: mat-slide-toggle bound via ngModel
    // calls writeValue() during SSR, which crashes the server renderer.
    { path: '',        renderMode: RenderMode.Client },
    { path: 'compiler', renderMode: RenderMode.Client },
    // All other routes use per-request SSR.
    { path: '**',      renderMode: RenderMode.Server },
];

See: SSR and Rendering Modes for the full deployment picture.


14. Functional HTTP Interceptors

Replaces the class-based HttpInterceptor interface. Registered in provideHttpClient(withInterceptors([…])).

// interceptors/error.interceptor.ts
import { HttpInterceptorFn, HttpErrorResponse } from '@angular/common/http';
import { inject } from '@angular/core';
import { catchError, throwError } from 'rxjs';
import { AuthService } from '../services/auth.service';

export const errorInterceptor: HttpInterceptorFn = (req, next) => {
    const auth = inject(AuthService);

    return next(req).pipe(
        catchError((error: HttpErrorResponse) => {
            if (error.status === 401) {
                auth.clearKey();
            }
            return throwError(() => error);
        }),
    );
};

Registration:

// app.config.ts
provideHttpClient(withFetch(), withInterceptors([errorInterceptor]))

See: interceptors/error.interceptor.ts


15. Functional Route Guards

Replaces class-based CanActivate. A CanActivateFn is a plain function that returns boolean | UrlTree | Observable | Promise of those types.

// guards/admin.guard.ts
import { inject } from '@angular/core';
import { CanActivateFn, Router } from '@angular/router';
import { AuthService } from '../services/auth.service';

export const adminGuard: CanActivateFn = () => {
    const auth = inject(AuthService);
    // Soft check: the admin component renders an inline auth form if no key is set.
    // For strict blocking, return a UrlTree instead:
    //   return auth.hasKey() || inject(Router).createUrlTree(['/']);
    return true;
};

Registration (static import — recommended for new guards):

// app.routes.ts
import { adminGuard } from './guards/admin.guard';

{
    path: 'admin',
    loadComponent: () => import('./admin/admin.component').then(m => m.AdminComponent),
    canActivate: [adminGuard],
}

See: guards/admin.guard.ts, app.routes.ts


Component Catalog

ComponentRouteKey Patterns
AppComponentShell (no route)viewChild(), toSignal(), effect(), inject(), route animations
HomeComponent/@defer on viewport, MetricsStore, StatCardComponent, skeleton loading
CompilerComponent/compilerrxResource(), linkedSignal(), SseService, Turnstile, FilterParserService, CDK Virtual Scroll
PerformanceComponent/performancehttpResource(), MetricsStore, SparklineComponent
ValidationComponent/validationValidationService, color-coded output
ApiDocsComponent/api-docshttpResource()
AdminComponent/adminrxResource(), AuthService, CDK Virtual Scroll, D1 SQL console
StatCardComponentSharedinput.required(), output(), model()
SkeletonCardComponentSharedmat-card appearance="outlined" + mat-progress-bar (buffer mode), shimmer CSS animation, configurable line count
SkeletonTableComponentSharedmat-card appearance="outlined" + mat-progress-bar (buffer mode), shimmer CSS animation, configurable rows/columns
SparklineComponentSharedmat-card appearance="outlined" wrapper, Canvas 2D line/area chart, zero dependencies
TurnstileComponentSharedmat-card appearance="outlined" wrapper, Cloudflare Turnstile CAPTCHA widget, TurnstileService
ErrorBoundaryComponentSharedReads GlobalErrorHandler signals, dismissible overlay

Services Catalog

ServiceScopeResponsibility
CompilerServicerootPOST /api/compile — returns Observable<CompileResponse>
SseServicerootGeneric fetch-based SSE client; returns SseConnection with events() / status() signals
MetricsServicerootGET /api/metrics, GET /api/health — returns Observables
ValidationServicerootPOST /api/validate — rule validation
StorageServicerootAdmin R2/D1 storage endpoints
AuthServicerootAdmin key management via sessionStorage
ThemeServicerootDark/light signal state; SSR-safe via inject(DOCUMENT)
TurnstileServicerootCloudflare Turnstile widget lifecycle + token signal
FilterParserServicerootWeb Worker bridge; result, isParsing, progress, error signals
SwrCacheServicerootGeneric stale-while-revalidate signal cache

State Management

The application uses Angular Signals for all state. There is no NgRx or other external state library.

Local Component State

Transient UI state (loading spinner, form values, open panels) lives in signal() fields on the component class:

readonly isOpen = signal(false);
readonly searchQuery = signal('');

Shared Singleton Stores

Cross-component state that must survive navigation lives in injectable stores (no NgModule needed):

// store/metrics.store.ts — shared by HomeComponent and PerformanceComponent
@Injectable({ providedIn: 'root' })
export class MetricsStore {
    private readonly swrCache = inject(SwrCacheService);

    private readonly metricsSwr = this.swrCache.get<ExtendedMetricsResponse>(
        'metrics',
        () => firstValueFrom(this.metricsService.getMetrics()),
        30_000,   // TTL: 30 s
    );

    // Expose read-only signals to consumers
    readonly metrics = this.metricsSwr.data;
    readonly isLoading = computed(() => this.metricsSwr.isRevalidating());
}

Stale-While-Revalidate Cache

SwrCacheService backs MetricsStore. On first access it fetches data and caches it. On subsequent accesses it returns the cached value immediately and revalidates in the background if the TTL has elapsed.

First call          → cache MISS  → fetch  → store data in signal → render
Second call (fresh) → cache HIT   → return immediately
Second call (stale) → cache HIT   → return stale immediately + revalidate in background → signal updates

Signal Store Pattern

graph LR
    A[Component A] -->|inject| S[MetricsStore]
    B[Component B] -->|inject| S
    S -->|get| C[SwrCacheService]
    C -->|firstValueFrom| M[MetricsService]
    M -->|HTTP GET| API[/api/metrics]
    C -->|data signal| S
    S -->|readonly signal| A
    S -->|readonly signal| B

Routing

All routes use lazy loading via loadComponent(). The Angular build pipeline emits a separate JS chunk per route that is only fetched when the user navigates to that route.

// app.routes.ts
export const routes: Routes = [
    {
        path: '',
        loadComponent: () => import('./home/home.component').then(m => m.HomeComponent),
        title: 'Home',
    },
    {
        path: 'compiler',
        loadComponent: () => import('./compiler/compiler.component').then(m => m.CompilerComponent),
        title: 'Compiler',
        data: { description: 'Configure and run filter list compilations' },
    },
    {
        path: 'api-docs',
        loadComponent: () => import('./api-docs/api-docs.component').then(m => m.ApiDocsComponent),
        title: 'API Reference',
    },
    // … more routes
    {
        path: 'admin',
        loadComponent: () => import('./admin/admin.component').then(m => m.AdminComponent),
        canActivate: [() => import('./guards/admin.guard').then(m => m.adminGuard)],
        title: 'Admin',
    },
    { path: '**', redirectTo: '' },
];

Route title values are short labels (e.g. 'Compiler'). The AppTitleStrategy appends the application name automatically, producing titles like "Compiler | Adblock Compiler" (see Page Titles below).

Router features enabled:

FeatureProvider optionEffect
Component input bindingwithComponentInputBinding()Route params auto-bound to input() signals
View Transitions APIwithViewTransitions()Native browser cross-document transition animations
Preload allwithPreloading(PreloadAllModules)All lazy chunks prefetched after initial navigation
Custom title strategy{ provide: TitleStrategy, useClass: AppTitleStrategy }Appends app name to every route title (WCAG 2.4.2)

Page Titles

src/app/title-strategy.ts implements a custom TitleStrategy that formats every page's <title> element as:

<route title> | Adblock Compiler

When a route has no title, the fallback is just "Adblock Compiler". This satisfies WCAG 2.4.2 (Page Titled — Level A).

// title-strategy.ts
@Injectable({ providedIn: 'root' })
export class AppTitleStrategy extends TitleStrategy {
    private readonly title = inject(Title);

    override updateTitle(snapshot: RouterStateSnapshot): void {
        const routeTitle = this.buildTitle(snapshot);
        this.title.setTitle(routeTitle ? `${routeTitle} | Adblock Compiler` : 'Adblock Compiler');
    }
}

Register it in app.config.ts:

{ provide: TitleStrategy, useClass: AppTitleStrategy }

SSR and Rendering Modes

graph TD
    REQ[Incoming Request] --> CFW[Cloudflare Worker<br/>server.ts]
    CFW --> ASSET{Static asset?}
    ASSET -->|Yes| CDN[ASSETS binding<br/>CDN — no Worker invoked]
    ASSET -->|No| AE[AngularAppEngine.handle]
    AE --> ROUTE{Route render mode}
    ROUTE -->|Prerender| SSG[Serve pre-built HTML<br/>from ASSETS binding]
    ROUTE -->|Server| SSR[Render in Worker isolate<br/>AngularAppEngine]
    ROUTE -->|Client| CSR[Serve app shell HTML<br/>browser renders]
    SSR --> CSP[Inject CSP + security headers]
    CSP --> RESP[Response to browser]
    SSG --> RESP
    CSR --> RESP

Cloudflare Workers Entry Point (server.ts)

import { AngularAppEngine } from '@angular/ssr';
import './src/main.server';   // registers the app with AngularAppEngine

const angularApp = new AngularAppEngine();

export default {
    async fetch(request: Request): Promise<Response> {
        const response = await angularApp.handle(request);
        if (!response) return new Response('Not found', { status: 404 });

        // Inject security headers on HTML responses
        if (response.headers.get('Content-Type')?.includes('text/html')) {
            const headers = new Headers(response.headers);
            headers.set('Content-Security-Policy', /* … see Security section */);
            headers.set('X-Content-Type-Options', 'nosniff');
            headers.set('X-Frame-Options', 'DENY');
            headers.set('Referrer-Policy', 'strict-origin-when-cross-origin');
            return new Response(response.body, { status: response.status, headers });
        }

        return response;
    },
};

SSR vs CSR vs Prerender

StrategyWhen to useExample route
RenderMode.ServerDynamic content, user-specific data/admin, /performance, /api-docs
RenderMode.PrerenderStatic content, SEO landing pages
RenderMode.ClientComponents with DOM-dependent Material widgets (e.g. mat-slide-toggle)/ (Home), /compiler

HTTP Transfer Cache

provideClientHydration(withHttpTransferCacheOptions({ includePostRequests: false })) prevents double-fetching: data fetched during SSR is serialised into the HTML payload and replayed client-side without a second network request.


Accessibility (WCAG 2.1)

The Angular frontend targets WCAG 2.1 Level AA compliance. The following features are implemented:

FeatureLocationStandard
Skip navigation linkapp.component.htmlWCAG 2.4.1 — Bypass Blocks
Unique per-route page titlesAppTitleStrategyWCAG 2.4.2 — Page Titled
Single <h1> per pageRoute componentsWCAG 1.3.1 — Info and Relationships
aria-label on <nav>app.component.htmlWCAG 4.1.2 — Name, Role, Value
aria-live="polite" on toast containernotification-container.component.tsWCAG 4.1.3 — Status Messages
aria-hidden="true" on decorative iconsHome, Admin, Compiler componentsWCAG 1.1.1 — Non-text Content
.visually-hidden utility classstyles.cssScreen-reader-only text pattern
prefers-reduced-motion media querystyles.cssWCAG 2.3.3 — Animation from Interactions
id="main-content" on <main>app.component.htmlSkip link target

The app shell renders a visually-hidden skip link as the first focusable element on every page:

<a class="skip-link" href="#main-content">Skip to main content</a>
<!-- … header/nav … -->
<main id="main-content" tabindex="-1">
    <router-outlet />
</main>

The .skip-link class in styles.css positions it off-screen until focused, then brings it into view for keyboard users.

Reduced Motion

All CSS transitions and animations respect the user's OS preference:

@media (prefers-reduced-motion: reduce) {
    *, *::before, *::after {
        animation-duration: 0.01ms !important;
        transition-duration: 0.01ms !important;
    }
}

Security

Content Security Policy

server.ts injects the following CSP on all HTML responses:

DirectiveValueRationale
default-src'self'Block everything by default
script-src'self' + Cloudflare originsAllow app scripts + Turnstile
style-src'self' 'unsafe-inline'Material's inline styles
img-src'self' data:Allow inline SVG/data URIs
font-src'self'npm-bundled fonts only
connect-src'self'API calls to same origin
frame-srchttps://challenges.cloudflare.comTurnstile iframe
object-src'none'Block plugins
base-uri'self'Prevent base-tag injection

Bot Protection (Cloudflare Turnstile)

TurnstileService manages the widget lifecycle. CompilerComponent gates form submission on a valid Turnstile token:

// compiler.component.ts
submit(): void {
    const token = this.turnstileService.token();
    if (!token && this.turnstileSiteKey) {
        console.warn('Turnstile token not yet available');
        return;
    }
    this.pendingRequest.set({ /* … token included */ });
}

TURNSTILE_SITE_KEY is provided via an InjectionToken. An empty string disables the widget for local development.

Admin Authentication

AuthService stores the admin API key in sessionStorage (cleared on tab close). The errorInterceptor automatically clears the key on HTTP 401 responses.


Testing

Unit Tests (Vitest)

Tests use Vitest with @analogjs/vitest-angular instead of Karma + Jasmine. All tests are zoneless and use provideZonelessChangeDetection().

// stat-card.component.spec.ts
import { TestBed } from '@angular/core/testing';
import { provideZonelessChangeDetection } from '@angular/core';
import { StatCardComponent } from './stat-card.component';

describe('StatCardComponent', () => {
    it('renders required label input', async () => {
        await TestBed.configureTestingModule({
            imports: [StatCardComponent],
            providers: [provideZonelessChangeDetection()],
        }).compileComponents();

        const fixture = TestBed.createComponent(StatCardComponent);

        // Signal input setter API (replaces fixture.debugElement.setInput)
        fixture.componentRef.setInput('label', 'Filter Lists');

        await fixture.whenStable();   // flush microtask scheduler (replaces fixture.detectChanges())
        expect(fixture.nativeElement.textContent).toContain('Filter Lists');
    });
});

Testing HTTP services:

// compiler.service.spec.ts
import { provideHttpClient } from '@angular/common/http';
import { provideHttpClientTesting, HttpTestingController } from '@angular/common/http/testing';
import { API_BASE_URL } from '../tokens';

beforeEach(async () => {
    await TestBed.configureTestingModule({
        providers: [
            provideZonelessChangeDetection(),
            provideHttpClient(),
            provideHttpClientTesting(),
            { provide: API_BASE_URL, useValue: '/api' },
        ],
    }).compileComponents();

    httpTesting = TestBed.inject(HttpTestingController);
});

it('POSTs to /api/compile', () => {
    service.compile(['https://example.com/list.txt'], ['Deduplicate'])
        .subscribe(result => expect(result.success).toBe(true));

    const req = httpTesting.expectOne('/api/compile');
    expect(req.request.method).toBe('POST');
    req.flush({ success: true, ruleCount: 42, sources: 1, transformations: [], message: 'OK' });
});

Test commands:

npm test               # vitest run  — single pass
npm run test:watch     # vitest      — watch mode
npm run test:coverage  # coverage report in coverage/index.html

Coverage config (vitest.config.ts): provider v8, reporters ['text', 'json', 'html'], includes src/app/**/*.ts, excludes *.spec.ts.

E2E Tests (Playwright)

Located in src/e2e/. Tests target the dev server at http://localhost:4200.

# Run all E2E tests (dev server must be running)
npx playwright test

# Run a specific spec
npx playwright test src/e2e/home.spec.ts

Spec files:

FileCovers
home.spec.tsDashboard renders, stat cards, defer blocks
compiler.spec.tsForm submission, SSE stream, transformation checkboxes
navigation.spec.tsSidenav links, route transitions, 404 redirect

Cloudflare Workers Deployment

graph LR
    subgraph Build
        B1[ng build] --> B2[Angular SSR bundle<br/>dist/frontend/server/]
        B1 --> B3[Static assets<br/>dist/frontend/browser/]
    end
    subgraph Deploy
        B2 --> WD[wrangler deploy]
        B3 --> WD
        WD --> CF[Cloudflare Workers<br/>300+ edge locations]
    end
    subgraph Runtime
        CF --> ASSETS[ASSETS binding<br/>CDN — JS / CSS / fonts]
        CF --> SSR[Worker isolate<br/>server.ts — HTML]
    end

wrangler.toml Key Settings

name            = "adblock-compiler-frontend"
main            = "dist/frontend/server/server.mjs"
compatibility_date = "2025-01-01"

[assets]
directory = "dist/frontend/browser"
binding   = "ASSETS"

Build and Deploy Steps

# 1. Full production build (SSR bundle + static assets)
#    The `postbuild` npm lifecycle hook runs automatically after ng build,
#    copying index.csr.html → index.html so the ASSETS binding serves the SPA shell.
npm run build

# 2. Preview locally (mirrors Workers runtime exactly)
npm run preview        # wrangler dev → http://localhost:8787

# 3. Deploy to production
deno task wrangler:deploy  # wrangler deploy

Note: RenderMode.Client routes cause Angular's SSR builder to emit index.csr.html (CSR = client-side render) instead of index.html. The scripts/postbuild.js script copies it to index.html so the Cloudflare Worker ASSETS binding and Cloudflare Pages can locate the SPA shell. A src/_redirects file (/* /index.html 200) provides the SPA fallback rule for Cloudflare Pages deployments.

Edge Compatibility

server.ts uses only the standard fetch Request/Response API and @angular/ssr's AngularAppEngine. It is compatible with any WinterCG-compliant runtime:

  • ✅ Cloudflare Workers
  • ✅ Deno Deploy
  • ✅ Fastly Compute
  • ✅ Node.js (with @hono/node-server or similar adapter)

Configuration Tokens

Declared in src/app/tokens.ts. Provide overrides in app.config.ts (browser) or app.config.server.ts (SSR).

TokenTypeDefaultDescription
API_BASE_URLstring'/api'Base URL for all HTTP service calls. SSR overrides this to an absolute Worker URL to avoid same-origin issues.
TURNSTILE_SITE_KEYstring''Cloudflare Turnstile public site key. Empty string disables the widget.

How to override:

// app.config.server.ts (SSR only)
import { mergeApplicationConfig } from '@angular/core';
import { appConfig } from './app.config';

const serverConfig: ApplicationConfig = {
    providers: [
        // Absolute URL required in the Worker isolate
        { provide: API_BASE_URL, useValue: 'https://adblock-compiler.workers.dev/api' },
    ],
};

export const config = mergeApplicationConfig(appConfig, serverConfig);

Extending the Frontend

Adding a New Page

  1. Create src/app/my-feature/my-feature.component.ts (standalone component).
  2. Add a lazy route in app.routes.ts:
    {
        path: 'my-feature',
        loadComponent: () => import('./my-feature/my-feature.component').then(m => m.MyFeatureComponent),
        title: 'My Feature - Adblock Compiler',
    }
    
  3. Add a nav item in app.component.ts:
    { path: '/my-feature', label: 'My Feature', icon: 'star' }
    
  4. Add a server render mode in app.routes.server.ts if needed (the catch-all ** covers new routes automatically).

Adding a New Service

  1. Create src/app/services/my.service.ts:
    import { Injectable, inject } from '@angular/core';
    import { HttpClient } from '@angular/common/http';
    import { Observable } from 'rxjs';
    import { API_BASE_URL } from '../tokens';
    
    @Injectable({ providedIn: 'root' })
    export class MyService {
        private readonly http = inject(HttpClient);
        private readonly baseUrl = inject(API_BASE_URL);
    
        getData(): Observable<MyResponse> {
            return this.http.get<MyResponse>(`${this.baseUrl}/my-endpoint`);
        }
    }
    
  2. Inject in components with inject(MyService) — no module registration needed.
  3. Add src/app/services/my.service.spec.ts with provideHttpClientTesting().

Adding a New Shared Component

  1. Create src/app/my-widget/my-widget.component.ts as a standalone component.
  2. Implement input(), output(), or model() for the public API.
  3. Import it directly in any consuming component's imports: [MyWidgetComponent].

Migration Reference (v16 → v21)

PatternAngular ≤ v16Angular 21
Component inputs@Input() label!: stringreadonly label = input.required<string>()
Component outputs@Output() clicked = new EventEmitter<string>()readonly clicked = output<string>()
Two-way binding@Input() val + @Output() valChangereadonly val = model<T>()
View queries@ViewChild('ref') el!: ElementRefreadonly el = viewChild<ElementRef>('ref')
Async dataObservable + manual subscribe + ngOnDestroyrxResource() / httpResource()
Linked stateeffect() writing a signallinkedSignal()
Post-render DOMngAfterViewInitafterRenderEffect()
App initAPP_INITIALIZER tokenprovideAppInitializer()
Observable → templateAsyncPipetoSignal()
Subscription teardownSubject<void> + ngOnDestroytakeUntilDestroyed(destroyRef)
Lazy renderingNone@defer with triggers
Change detectionZone.jsprovideZonelessChangeDetection()
SSR serverExpress.jsCloudflare Workers AngularAppEngine fetch handler
DI styleConstructor paramsinject() functional DI
NgModulesRequiredStandalone components (no modules)
HTTP interceptorsClass HttpInterceptorFunctional HttpInterceptorFn
Route guardsClass CanActivateFunctional CanActivateFn
Structural directives*ngIf, *ngFor, *ngSwitch@if, @for, @switch
Test runnerKarma + JasmineVitest + @analogjs/vitest-angular
FontsGoogle Fonts CDN@fontsource / material-symbols npm packages

Further Reading

Angular 21 Feature Parity Checklist

Purpose: Definitive audit confirming every feature, page, link, theme, and API endpoint from the legacy HTML/CSS frontend exists and functions correctly in the Angular 21 SPA.

Status: ✅ All items verified — zero untracked regressions.

Last reviewed: 2026-03-08


Table of Contents

  1. Pages & Routes
  2. Feature Parity by Page
  3. Theme Consistency
  4. Navigation Links & External References
  5. Mobile / Responsive Layout
  6. API Endpoints
  7. Regressions & Known Gaps

1. Pages & Routes

Maps every legacy static HTML page to its Angular 21 equivalent.

Legacy FileURLAngular RouteComponentStatus
index.html (Admin Dashboard)//HomeComponent
compiler.html/compiler.html/compilerCompilerComponent
admin-storage.html/admin-storage.html/adminAdminComponent
test.html/test.html/ + /api-docsApiTesterComponent + ApiDocsComponent
validation-demo.html/validation-demo.html/validationValidationComponent
websocket-test.html/websocket-test.html/api-docsApiDocsComponent (endpoint docs)⚠️ See §7
e2e-tests.html/e2e-tests.htmlN/A (Playwright in /e2e/)⚠️ See §7
/performancePerformanceComponent✅ (new in Angular)

Legacy → Angular route redirect coverage: All old URL paths that browsers may have bookmarked are handled by the SPA fallback in worker.ts — unknown paths redirect to /.


2. Feature Parity by Page

2.1 Dashboard — / (HomeComponent)

Maps to legacy index.html (Admin Dashboard).

FeatureLegacy index.htmlAngular HomeComponentStatus
System status bar (health check)
Total Requests metric card
Queue Depth metric card
Cache Hit Rate metric card
Avg Response Time metric card
Queue depth count card (5th card)✅ (new)
Queue depth chart (Chart.js)✅ (SVG via QueueChartComponent)
Quick-action buttons (compile, batch, async)
Navigation grid (tools & pages)
Endpoint comparison table
Inline API tester✅ (test.html)✅ (ApiTesterComponent)
Notification settings toggle✅ (NotificationService)
Auto-refresh toggle + configurable interval
Manual "Refresh" button✅ (MetricsStore.refresh())
Skeleton loading placeholders✅ (SkeletonCardComponent)✅ (improved)

2.2 Compiler — /compiler (CompilerComponent)

Maps to legacy compiler.html.

FeatureLegacy compiler.htmlAngular CompilerComponentStatus
JSON compilation mode
SSE streaming mode
Async / queued mode
Batch compilation mode
Batch + Async mode✅ (new)
Preset selector✅ (linkedSignal() URL defaults)
Add/remove source URL fields✅ (reactive FormArray)
Transformation checkboxes
Benchmark flag
Real-time queue stats panel✅ (shown for async modes)✅ (new)
Compilation result display✅ (CDK Virtual Scroll)
File drag-and-drop upload✅ (Web Worker parsing)✅ (new)
Turnstile bot protection✅ (TurnstileComponent)✅ (new)
Progress indication✅ (MatProgressBar)
Log / notification integration✅ (LogService, NotificationService)✅ (new)

2.3 Performance — /performance (PerformanceComponent)

No direct legacy equivalent; functionality was previously spread across the dashboard.

FeatureLegacyAngular PerformanceComponentStatus
System health statuspartial (/metrics call)✅ (/health/latest)
Uptime display✅ (new)
Per-endpoint request counts✅ (index.html metrics)✅ (MatTable)
Per-endpoint success/failure✅ (new)
Per-endpoint avg duration✅ (new)
Sparkline charts per endpoint✅ (SparklineComponent)✅ (new)
Auto-refresh via MetricsStorepartial

2.4 Validation — /validation (ValidationComponent)

Maps to legacy validation-demo.html.

FeatureLegacy validation-demo.htmlAngular ValidationComponentStatus
Multi-line rules textarea
Rule count hint
Strict mode toggle
Validate button with spinner
Color-coded error/warning/ok output
Pass/fail summary chips
Per-rule AGTree parse errors✅ (ValidationService)

2.5 API Reference — /api-docs (ApiDocsComponent)

Maps to legacy inline API docs (in index.html) and the standalone /api JSON endpoint.

FeatureLegacyAngular ApiDocsComponentStatus
Endpoint list with methods✅ (HTML list)✅ (grouped cards)
Compilation endpoints
Monitoring endpoints
Queue management endpoints
Workflow endpoints✅ (new)
Validation endpoint✅ (new)
Admin endpoints
Live version display (/api/version)✅ (httpResource())✅ (new)
Built-in API tester (send requests)partial (test.html)
cURL example generation

2.6 Admin — /admin (AdminComponent)

Maps to legacy admin-storage.html.

FeatureLegacy admin-storage.htmlAngular AdminComponentStatus
Auth gate (X-Admin-Key)✅ (AuthService, adminGuard)
Authenticated status bar
Storage stats (KV / R2 / D1 counts)✅ (StorageService)
D1 table list
Read-only SQL query console✅ (CDK Virtual Scroll results)
Clear expired entries
Clear cache
Vacuum D1 database
Skeleton loading state✅ (SkeletonCardComponent)✅ (improved)

3. Theme Consistency

RequirementImplementationStatus
Dark / light theme toggleThemeService — persists in localStorage, applies dark-theme class + data-theme attribute to <body>
Theme toggle in toolbarAppComponent toolbar button, accessible via keyboard
No flash of unstyled content (FOUC)loadPreferences() runs in constructor before first render
Consistent theme across all routesSingle ThemeService + Angular Material theming via CSS custom props
Compiler pageMaterial Design 3 color tokens, dark-theme class propagates
Dashboard / Home pageSame
Admin pageSame
Performance pageSame
Validation pageSame
API Docs pageSame

Internal Navigation

Link / ActionLegacyAngularStatus
Home / Dashboardindex.html/ via routerLink
Compilercompiler.html/compiler via routerLink
Performance/performance via routerLink
Validationvalidation-demo.html/validation via routerLink
API Docsindex.html#api/api-docs via routerLink
Adminadmin-storage.html/admin via routerLink
404 fallback** → redirect to /
Skip-to-main-content link✅ (<a href="#main-content">)✅ (a11y new)

Desktop / Mobile Navigation

Navigation PatternAngularStatus
Horizontal tab bar (desktop)routerLink + routerLinkActive tabs in toolbar
Slide-over sidenav (mobile)MatSidenav (mode="over") with hamburger button
Active route highlightrouterLinkActive="active-nav-item"

External References

LinkDestinationLocation in AngularStatus
GitHub repositoryhttps://github.com/jaypatrick/adblock-compilerAppComponent footer
JSR package@jk-com/adblock-compiler (via GitHub link)Footer
Live service URLhttps://adblock-compiler.jayson-knight.workers.dev/— (API calls use relative paths)

5. Mobile / Responsive Layout

RequirementImplementationStatus
Slide-over navigation drawer on mobileMatSidenav mode="over" in AppComponent
Hamburger menu buttonShown on small viewports (<= 768 px) via CSS display
Desktop horizontal tabs hidden on mobileCSS media query hides .app-nav-tabs
Stat cards responsive gridCSS grid with auto-fill / minmax
Compiler form adapts to narrow screensMatFormField full-width, stacked layout
Admin SQL console wraps correctlyCDK Virtual Scroll with overflow handling
Navigation grid auto-reflowCSS grid auto-fill
Table horizontal scrolloverflow-x: auto wrapper on all MatTable

6. API Endpoints

All worker API endpoints surfaced in the Angular frontend (called from services and documented in ApiDocsComponent).

6.1 Compilation

EndpointWorkerAngular ConsumerStatus
POST /compileCompilerService.compile()
POST /compile/streamSseService + CompilerService.stream()
POST /compile/batchCompilerService.batch()
POST /compile/asyncCompilerService.compileAsync()
POST /compile/batch/asyncCompilerService.batchAsync()
GET /ws/compileDocumented in /api-docs⚠️ See §7
POST /ast/parseApiDocsComponent tester

6.2 Monitoring & Health

EndpointWorkerAngular ConsumerStatus
GET /healthMetricsStore (health polling)
GET /health/latestPerformanceComponent (httpResource)
GET /metricsMetricsStore / MetricsService
GET /apiApiDocsComponent
GET /api/versionApiDocsComponent (httpResource)
GET /api/deploymentsDocumented in /api-docs
GET /api/deployments/statsDocumented in /api-docs

6.3 Queue Management

EndpointWorkerAngular ConsumerStatus
GET /queue/statsQueueService, MetricsStore
GET /queue/historyQueueService, QueueChartComponent
GET /queue/results/:requestIdCompilerService (async polling)
POST /queue/cancel/:requestIdCompilerService.cancelJob()

6.4 Workflow (Durable Execution)

EndpointWorkerAngular ConsumerStatus
POST /workflow/compileApiDocsComponent (documented)
POST /workflow/batchApiDocsComponent (documented)
GET /workflow/status/:instanceIdApiDocsComponent (documented)
GET /workflow/metricsApiDocsComponent (documented)
GET /workflow/events/:instanceIdApiDocsComponent (documented)
POST /workflow/cache-warmApiDocsComponent (documented)
POST /workflow/health-checkApiDocsComponent (documented)

6.5 Validation

EndpointWorkerAngular ConsumerStatus
POST /api/validateValidationService

6.6 Admin Storage (auth-gated)

EndpointWorkerAngular ConsumerStatus
GET /admin/storage/statsStorageService.getStats()
GET /admin/storage/tablesStorageService.getTables()
POST /admin/storage/queryStorageService.query()
POST /admin/storage/clear-expiredStorageService.clearExpired()
POST /admin/storage/clear-cacheStorageService.clearCache()
POST /admin/storage/vacuumStorageService.vacuum()
GET /admin/storage/exportApiDocsComponent (documented)

6.7 Configuration

EndpointWorkerAngular ConsumerStatus
GET /api/turnstile-configTurnstileService

7. Regressions & Known Gaps

7.1 websocket-test.html — No Dedicated Angular Route

Legacy: A standalone HTML page at /websocket-test.html provided an interactive WebSocket client to exercise the GET /ws/compile endpoint.

Angular status: There is no dedicated Angular route for WebSocket testing.

Mitigation:

  • The GET /ws/compile endpoint is fully documented in the /api-docs route with method, path, and description.
  • The endpoint remains operational in the Worker.
  • Manual testing can be performed using browser DevTools or wscat.

Recommendation: If interactive WebSocket testing is desired in the SPA, add a /ws-test route with a WsTestComponent that opens a WebSocket and displays send/receive frames. Log this as a child issue if needed.

Severity: Low — endpoint unchanged; only the interactive HTML tester is absent.


7.2 e2e-tests.html — Test Runner Removed from Production SPA

Legacy: An HTML page at /e2e-tests.html embedded a browser-based end-to-end test runner that could be opened in any browser to run API integration tests.

Angular status: Not ported to the Angular SPA. End-to-end tests now live in frontend/e2e/ and are executed with Playwright (npm run e2e).

Mitigation:

  • Playwright tests in frontend/e2e/ cover the same navigation and API scenarios.
  • The e2e-tests.html approach was a development/debug convenience, not a production feature used by end-users.

Recommendation: Keep Playwright as the canonical e2e mechanism. The HTML test runner is not required in the production SPA.

Severity: Low — test coverage maintained via Playwright; no user-facing regression.


Summary

CategoryTotal Items✅ Present⚠️ Gap / Notes
Pages / Routes862 (see §7)
Dashboard features14140
Compiler features14140
Performance features770
Validation features770
API Docs features10100
Admin features990
Theme items10100
Navigation / links14140
Responsive layout880
API endpoints30291 (/ws/compile not surfaced as interactive UI)
Total1311283

All three gaps are low-severity development/debug conveniences with documented mitigations. There are zero untracked regressions in user-facing functionality.

SPA Benefits Analysis — Adblock Compiler

Question: Would This App Benefit From Being a Single Page Application?

Short answer: Yes.

The Adblock Compiler is currently a multi-page application (MPA) where each public/*.html file is an independent page that triggers a full browser reload on every navigation. Converting to a Single Page Application (SPA) would meaningfully improve the user experience, developer experience, and long-term maintainability.


Current Architecture (Multi-Page)

public/
├── index.html          ← Admin dashboard
├── compiler.html       ← Compiler UI
├── admin-storage.html  ← Storage admin
├── test.html           ← API tester
├── e2e-tests.html      ← E2E test runner
├── validation-demo.html
└── websocket-test.html

Each page is isolated. Navigation between them triggers a full browser reload, re-downloads shared CSS/JS, and discards all in-memory state (form inputs, results, theme settings not yet flushed to localStorage).


SPA Benefits

1. Instant Navigation (No Full-Page Reloads)

In the current MPA, clicking "Compiler" from the dashboard causes the browser to:

  1. Send a new HTTP request
  2. Download and parse compiler.html
  3. Re-download shared CSS and JS modules
  4. Re-initialise theme, chart libraries, and event listeners

With a SPA, navigation is handled entirely in JavaScript — the URL changes, the current "page" component is swapped out, and the rest of the shell (navigation, theme, cached data) stays intact. Page transitions feel instant.

2. Shared State Across Views

With an MPA, sharing data between pages requires localStorage, sessionStorage, URL parameters, or a server round-trip. With a SPA, all views share the same JavaScript heap:

compiler result → still in memory when navigating to "Test" page
theme selection → applied once, persisted in the Vue/Angular state
API health data → fetched once at app startup, reused everywhere

This eliminates redundant API calls and simplifies state management.

3. Component Reusability and DRY Code

The current pages duplicate:

  • Theme toggle HTML, CSS, and JS (repeated in every .html file)
  • Navigation markup and link styling
  • Shared CSS variable declarations
  • Loading spinner HTML patterns

A SPA consolidates these into reusable components that render once and are shared across all views. Changes to the navigation or theme toggle are made in one place.

4. Code Splitting and Lazy Loading

Modern SPA frameworks paired with Vite automatically split the app bundle by route. Code for the "Admin Storage" page is never downloaded unless the user navigates there. This improves Time to Interactive (TTI) for all users.

The existing Vite configuration already supports this via @vitejs/plugin-vue — no additional tooling changes are required.

5. Better Loading UX

SPAs enable skeleton screens, optimistic updates, and progressive loading that are impossible with full-page reloads:

  • Show the navigation shell instantly
  • Stream in stats as they arrive from the API
  • Display "Compiling…" inline without a blank white flash

6. Improved Testability

Component-based SPAs are significantly easier to unit test:

  • Each component can be rendered in isolation
  • State changes are predictable and inspectable
  • Mocking API calls is straightforward
  • End-to-end tests navigate within the same page context (no cross-page coordination)

7. Mobile and PWA Readiness

SPAs are the natural foundation for Progressive Web Apps (PWAs). Adding a service worker for offline support, app-shell caching, and push notifications is straightforward once the app is already an SPA.


Why the Infrastructure Is Already Ready

The Vite build system already ships @vitejs/plugin-vue:

// vite.config.ts (excerpt)
import vue from '@vitejs/plugin-vue';
import vueJsx from '@vitejs/plugin-vue-jsx';

export default defineConfig({
    plugins: [vue(), vueJsx()],
    // ...
});

This means .vue Single-File Components can already be imported and bundled without any additional tooling changes. Adding a new SPA entry point requires only:

  1. A new *.html entry in vite.config.ts rollupOptions.input
  2. A main.ts that mounts the Vue root
  3. Route components for each current page

Phase 1 — Add a Vue SPA entry (lowest risk)

Add a new public/app.html entry that mounts a Vue 3 SPA alongside the existing MPA pages. Users can opt in to the new SPA experience while the existing pages remain untouched.

Phase 2 — Migrate pages incrementally

Migrate pages one at a time from static HTML into Vue route components:

  1. Home dashboard (index.html/) — stats, chart, health status
  2. Compiler (compiler.html/compiler) — form, results, SSE streaming
  3. Test (test.html/test) — API test runner
  4. Admin Storage (admin-storage.html/admin) — storage management

Phase 3 — Remove legacy pages

Once all pages are ported and the SPA is stable, the legacy .html files can be removed and the SPA entry can become the single index.html.


Framework Recommendation

For this project, Vue 3 is the recommended choice:

CriterionVue 3Angular
Learning curveLowHigh
Bundle sizeSmallLarge
TypeScriptOptional (excellent)Required
Official router✅ Vue Router 4✅ Angular Router
State management✅ Pinia (official)✅ Signals + RxJS
Vite integration✅ First-classPartial
Cloudflare Workers

Vue 3 balances a low learning curve, excellent TypeScript support, first-class Vite integration, and an official router and state management library. The project's existing Vite setup already has @vitejs/plugin-vue installed and active.


Tailwind CSS v4 Integration

This document explains how Tailwind CSS v4 is integrated into the Angular frontend.

Overview

Tailwind CSS v4 has been integrated into the Angular 21 frontend using a CSS-first, PostCSS-based approach. v4 introduces significant changes from v3:

  • No config file required — configuration lives in CSS via @theme and @custom-variant
  • Single import@import "tailwindcss" replaces the three @tailwind directives
  • New PostCSS plugin — uses @tailwindcss/postcss instead of tailwindcss directly
  • Automatic content scanning — no content array needed in config

Configuration Files

.postcssrc.json

PostCSS configuration using the v4 plugin:

{
  "plugins": {
    "@tailwindcss/postcss": {}
  }
}

src/styles.css

Tailwind is imported at the top of the global stylesheet, before Angular Material:

@import "tailwindcss";

@custom-variant dark (&:where(body.dark-theme *, [data-theme='dark'] *));

The @custom-variant dark selector matches the existing ThemeService dark mode selectors (body.dark-theme class and html[data-theme='dark'] attribute).

Material Design 3 Bridge (@theme inline)

The integration's key feature is a @theme inline block that maps Angular Material's M3 role tokens to Tailwind CSS custom properties. This makes every Material token available as a semantic Tailwind utility class.

@theme inline {
    --color-primary: var(--mat-sys-primary);
    --color-on-surface: var(--mat-sys-on-surface);
    --color-surface-variant: var(--mat-sys-surface-variant);
    --color-on-surface-variant: var(--mat-sys-on-surface-variant);
    --color-error: var(--mat-sys-error);
    --color-outline: var(--mat-sys-outline);
    --font-sans: 'IBM Plex Sans', sans-serif;
    --font-mono: 'JetBrains Mono', monospace;
    --font-display: 'Syne', sans-serif;
    /* ... full list in styles.css */
}

Why inline?

The inline keyword tells Tailwind v4 to resolve values at runtime rather than build time. This is essential for integration with Angular Material M3 tokens, whose CSS custom properties change value when the dark theme is applied — ensuring dark mode works correctly with all generated Tailwind utilities.

Generated utilities

Every --color-* entry generates bg-*, text-*, border-*, ring-*, and fill-* utilities. Every --font-* entry generates font-* utilities.

CSS variableExample Tailwind classes
--color-primarybg-primary, text-primary, border-primary
--color-on-surfacetext-on-surface
--color-surface-variantbg-surface-variant
--color-on-surface-varianttext-on-surface-variant
--color-errortext-error, border-error
--color-tertiarytext-tertiary
--color-outlineborder-outline
--font-sansfont-sans (IBM Plex Sans)
--font-monofont-mono (JetBrains Mono)
--font-displayfont-display (Syne)

Usage in Components

Angular components use Tailwind utility classes directly in their inline templates.

Semantic color classes (preferred)

Use the bridged Material token utilities instead of arbitrary CSS variable values:

<!-- ✅ Preferred: semantic Tailwind class via @theme inline bridge -->
<div class="bg-surface-variant text-on-surface-variant">...</div>

<!-- ❌ Avoid: arbitrary value syntax — brittle and verbose -->
<div class="bg-[var(--mat-sys-surface-variant)] text-[var(--mat-sys-on-surface-variant)]">...</div>

Layout and Spacing

<!-- Flex row with gap -->
<div class="flex items-center gap-4">
  <span>Item 1</span>
  <span>Item 2</span>
</div>

<!-- Responsive grid -->
<div class="grid grid-cols-[repeat(auto-fit,minmax(140px,1fr))] gap-4">
  <!-- Grid items -->
</div>

Skeleton Loaders

Skeleton components use Tailwind's animate-pulse utility with Material surface tokens:

<div class="h-[14px] rounded animate-pulse bg-surface-variant"></div>

Dark Mode

Tailwind dark mode is wired to the same selectors as the existing ThemeService. M3 token utilities (bg-primary, text-on-surface, etc.) automatically adapt because the underlying CSS variables change at runtime when the dark theme activates — no dark: prefix needed for Material-token-based utilities:

<!-- M3 tokens: dark mode handled automatically via CSS variable swap -->
<div class="bg-surface-variant text-on-surface-variant">Always correct</div>

<!-- Standard Tailwind colors: use dark: prefix -->
<div class="bg-white dark:bg-zinc-900">Custom palette value</div>

Integration Rules

ConcernUse
Layout (flex, grid, spacing)Tailwind utilities
Color (backgrounds, text, borders)Semantic classes via @theme inline bridge
Typography size/weightTailwind (text-sm, font-bold)
Font familyfont-sans, font-mono, font-display (bridged)
Angular Material componentsLeave to Material — do not override with Tailwind
Hover/focus transforms, complex stateComponent-scoped CSS in styles: []

Development Workflow

  1. Add Tailwind classes directly to Angular component inline templates
  2. Run ng serve — Angular CLI processes PostCSS automatically via .postcssrc.json
  3. No separate CSS build step required

Production

Angular CLI handles Tailwind's CSS tree-shaking automatically as part of the build process. Only classes used in component templates are included in the final bundle.

References

Validation UI Component

A comprehensive, color-coded UI component for displaying validation errors from AGTree-parsed filter rules.

Features

Color-Coded Error Types - Each error type has a unique color scheme for instant recognition 🎨 Syntax Highlighting - Filter rules are syntax-highlighted based on their type 🌳 AST Visualization - Interactive AST tree view with color-coded node types 🔍 Error Filtering - Filter by severity (All, Errors, Warnings) 📊 Summary Statistics - Visual cards showing validation metrics 📥 Export Capability - Download validation reports as JSON 🌙 Dark Mode - Full support for light and dark themes 📱 Responsive Design - Works on all screen sizes

Quick Start

Include the Script

<script src="validation-ui.js"></script>

Display a Validation Report

const report = {
    totalRules: 1000,
    validRules: 950,
    invalidRules: 50,
    errorCount: 45,
    warningCount: 5,
    infoCount: 0,
    errors: [
        {
            type: 'unsupported_modifier',
            severity: 'error',
            ruleText: '||example.com^$popup',
            message: 'Unsupported modifier: popup',
            details: 'Supported modifiers: important, ~important, ctag...',
            lineNumber: 42,
            sourceName: 'Custom Filter'
        }
    ]
};

ValidationUI.showReport(report);

Color Coding Guide

Error Types

Error TypeColorHex Code
Parse ErrorRed#dc3545
Syntax ErrorRed#dc3545
Unsupported ModifierOrange#fd7e14
Invalid HostnamePink#e83e8c
IP Not AllowedPurple#6610f2
Pattern Too ShortYellow#ffc107
Public Suffix MatchLight Red#ff6b6b
Invalid CharactersMagenta#d63384
Cosmetic Not SupportedCyan#0dcaf0

AST Node Types

Node TypeColorHex Code
Network CategoryBlue#0d6efd
Network RuleLight Blue#0dcaf0
Host RulePurple#6610f2
Cosmetic RulePink#d63384
ModifierOrange#fd7e14
CommentGray#6c757d
Invalid RuleRed#dc3545

Syntax Highlighting

Rules are automatically syntax-highlighted:

Network Rules

||example.com^$third-party
  ││          │ │
  │└──────────┘ │
  │   Domain    │
  │  (blue)     │
  └─────────────┘
    Separators
     (gray)
       └──────────────┘
          Modifiers
          (orange)

Exception Rules

@@||example.com^
││
│└─────────────────┘
│      Pattern
│      (blue)
└──────────────────┘
  Exception marker
     (green)

Host Rules

0.0.0.0 example.com
│       │
│       └──────────┘
│         Domain
│         (blue)
└──────────────────┘
   IP Address
   (purple)

API Reference

ValidationUI.showReport(report)

Display a validation report.

Parameters:

  • report (ValidationReport) - The validation report to display

Example:

ValidationUI.showReport({
    totalRules: 100,
    validRules: 95,
    invalidRules: 5,
    errorCount: 4,
    warningCount: 1,
    infoCount: 0,
    errors: [...]
});

ValidationUI.hideReport()

Hide the validation report section.

Example:

ValidationUI.hideReport();

ValidationUI.renderReport(report, container)

Render a validation report in a specific container element.

Parameters:

  • report (ValidationReport) - The validation report
  • container (HTMLElement) - Container element to render in

Example:

const container = document.getElementById('my-container');
ValidationUI.renderReport(report, container);

ValidationUI.downloadReport()

Download the current validation report as JSON.

Example:

// Add a button to trigger download
button.addEventListener('click', () => {
    ValidationUI.downloadReport();
});

Data Structures

ValidationReport

interface ValidationReport {
    errorCount: number;
    warningCount: number;
    infoCount: number;
    errors: ValidationError[];
    totalRules: number;
    validRules: number;
    invalidRules: number;
}

ValidationError

interface ValidationError {
    type: ValidationErrorType;
    severity: ValidationSeverity;
    ruleText: string;
    lineNumber?: number;
    message: string;
    details?: string;
    ast?: AnyRule;
    sourceName?: string;
}

ValidationErrorType

enum ValidationErrorType {
    parse_error = 'parse_error',
    syntax_error = 'syntax_error',
    unsupported_modifier = 'unsupported_modifier',
    invalid_hostname = 'invalid_hostname',
    ip_not_allowed = 'ip_not_allowed',
    pattern_too_short = 'pattern_too_short',
    public_suffix_match = 'public_suffix_match',
    invalid_characters = 'invalid_characters',
    cosmetic_not_supported = 'cosmetic_not_supported',
    modifier_validation_failed = 'modifier_validation_failed',
}

ValidationSeverity

enum ValidationSeverity {
    error = 'error',
    warning = 'warning',
    info = 'info',
}

Visual Examples

Summary Cards

The UI displays summary statistics in color-coded cards:

┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐
│  Total Rules    │ │   Valid Rules   │ │  Invalid Rules  │
│      1000       │ │       950       │ │       50        │
│    (purple)     │ │     (green)     │ │      (red)      │
└─────────────────┘ └─────────────────┘ └─────────────────┘

Error List Item

Each error is displayed with:

┌────────────────────────────────────────────────────────┐
│ [ERROR]  Unsupported Modifier  (Line 42) [Custom Filter] │
│                                                          │
│ Unsupported modifier: popup                             │
│ Supported modifiers: important, ctag, dnstype...        │
│                                                          │
│ ┌────────────────────────────────────────────────────┐ │
│ │ ||example.com^$popup                                │ │
│ │   └──────────┘ └─────┘                              │ │
│ │     domain   modifier (highlighted in red)          │ │
│ └────────────────────────────────────────────────────┘ │
│                                                          │
│ [🔍 Show AST]                                            │
└────────────────────────────────────────────────────────┘

AST Visualization

Expandable AST tree with color-coded nodes:

[NetworkRule]  (light blue badge)
  pattern: ||example.com^ (blue text)
  exception: false (red text)
  modifiers:
    [ModifierList] (orange badge)
      [0] [Modifier] (orange badge)
        name: popup (blue text)
        value: null (gray text)

Integration with Compiler

To integrate with the adblock-compiler:

// In your compilation workflow
const validator = new ValidateTransformation(false);
validator.setSourceName('My Filter List');

const validRules = validator.executeSync(rules);
const report = validator.getValidationReport(
    rules.length,
    validRules.length
);

// Display in UI
ValidationUI.showReport(report);

Demo Page

A demo page is included (validation-demo.html) that shows:

  • Color legend for error types
  • Color legend for AST node types
  • Sample validation reports
  • Dark mode toggle
  • Interactive examples

To view:

  1. Open validation-demo.html in a browser
  2. Click "Load Sample Report" to see examples
  3. Toggle dark mode to see theme adaptation
  4. Click on AST buttons to explore parsed structures

Browser Compatibility

  • Chrome/Edge: ✅ Full support
  • Firefox: ✅ Full support
  • Safari: ✅ Full support
  • Mobile browsers: ✅ Responsive design

Styling

The component uses CSS custom properties for theming:

:root {
    --alert-error-bg: #f8d7da;
    --alert-error-text: #721c24;
    --alert-error-border: #dc3545;
    --log-warn-bg: #fff3cd;
    --log-warn-text: #856404;
    --log-warn-border: #ffc107;
    /* ... etc */
}

Override these in your stylesheet to customize colors.

Contributing

When adding new error types:

  1. Add the error type to ValidationErrorType enum
  2. Add color scheme in getErrorTypeColor() method
  3. Add syntax highlighting logic in highlightRule() if needed
  4. Update documentation and demo

License

Part of the adblock-compiler project. See main project LICENSE.

Vite Integration

This document describes how Vite is used as the build tool for the Adblock Compiler frontend UI (the static files served by the Cloudflare Worker).

Overview

Vite processes all HTML pages in public/ as a multi-page application:

  • Bundles local JavaScript/TypeScript modules (public/js/)
  • Extracts and optimises CSS (including the shared design-system styles)
  • Replaces CDN Chart.js with a tree-shaken npm bundle
  • Outputs production-ready assets to dist/
  • Supports Vue 3 Single-File Components (.vue files) via @vitejs/plugin-vue
  • Supports Vue 3 JSX/TSX via @vitejs/plugin-vue-jsx
  • Supports React JSX/TSX with Fast Refresh via @vitejs/plugin-react

External scripts that must stay as CDN references (Cloudflare Web Analytics, Cloudflare Turnstile) are left untouched by Vite.

Plugins

PluginVersionPurpose
@vitejs/plugin-vue^6.0.4Vue 3 Single-File Component (.vue) support
@vitejs/plugin-vue-jsx^5.1.4Vue 3 JSX and TSX transform support
@vitejs/plugin-react^5.1.4React JSX/TSX transform with Babel Fast Refresh

All three plugins are active for every build and dev-server session. They have no impact on pages that do not import Vue or React components.

Directory Structure

public/                     ← Vite root (source files)
├── js/
│   ├── theme.ts            ← Dark/light mode toggle (ES module)
│   └── chart.ts            ← Chart.js npm import + global registration
├── shared-styles.css       ← Design-system CSS variables
├── validation-ui.js        ← Validation UI component (ES module)
├── index.html              ← Admin dashboard
├── compiler.html           ← Main compiler UI
├── test.html               ← API tester
├── admin-storage.html      ← Storage admin
├── e2e-tests.html          ← E2E test runner
├── validation-demo.html    ← Validation demo
└── websocket-test.html     ← WebSocket tester

dist/                       ← Vite build output (git-ignored)

Scripts

CommandDescription
npm run ui:devStart the Vite dev server on http://localhost:5173 with HMR
npm run ui:buildProduction build → dist/
npm run ui:previewServe the dist/ build locally for smoke-testing

Development Workflow

Option A — Vite dev server only (UI changes)

# Terminal 1: start the Cloudflare Worker backend
wrangler dev          # listens on http://localhost:8787

# Terminal 2: start the Vite dev server
npm run ui:dev        # proxies /api, /compile, /health, /ws → :8787

Open http://localhost:5173 in the browser. Hot-module replacement (HMR) means UI changes are reflected immediately without a full page reload.

Option B — Wrangler dev only (worker changes)

If you only need to iterate on the Worker code and the UI is not changing, build the UI once and then use Wrangler's built-in static-asset serving:

npm run ui:build      # generates dist/
wrangler dev          # serves dist/ as static assets on :8787

Open http://localhost:8787 in the browser.

Production Deployment

npm run ui:build orchestrates a 3-step pipeline. Wrangler's [build] config invokes it automatically before every wrangler deploy:

wrangler deploy
# ↳ runs: npm run ui:build
#         1. npm run build:css:prod  → generates public/tailwind.css (minified)
#         2. vite build              → bundles JS/TS modules, extracts CSS → dist/
#         3. npm run ui:copy-static  → copies tailwind.css, shared-styles.css,
#                                      shared-theme.js, compiler-worker.js, docs/ → dist/
# ↳ deploys Worker + static assets from dist/

Note: npm run build:css / npm run build:css:watch are still useful during development when working outside the Vite dev server (e.g. previewing raw HTML files directly in a browser without running npm run ui:dev).

What Was Migrated

BeforeAfter
Chart.js loaded from jsDelivr CDNBundled from chart.js npm package
shared-theme.js (global IIFE)public/js/theme.ts (typed ES module, window.AdblockTheme still available)
validation-ui.js (no exports)validation-ui.js (adds export { ValidationUI })
Empty [build] in wrangler.tomlnpm run ui:build wires Vite into the deploy pipeline
Assets served from ./publicAssets served from ./dist (Vite output)
No Vue/React plugin support@vitejs/plugin-vue, @vitejs/plugin-vue-jsx, @vitejs/plugin-react integrated

Proxy Configuration

The Vite dev server (vite.config.ts) proxies the following paths to the local Worker:

PathTarget
/apihttp://localhost:8787
/compilehttp://localhost:8787
/batchhttp://localhost:8787
/healthhttp://localhost:8787
/ssehttp://localhost:8787
/wsws://localhost:8787 (WebSocket)

Adding a New Page

  1. Create public/your-page.html with a <script type="module" src="/js/your-module.ts"> entry.
  2. Add an entry to rollupOptions.input in vite.config.ts:
    'your-page': resolve(__dirname, 'public/your-page.html'),
    
  3. Create public/js/your-module.ts with the page-specific TypeScript.

Adding a New Shared Module

  1. Create public/js/your-module.ts as a standard ES module.
  2. Import it from any HTML entry point using <script type="module" src="/js/your-module.ts">.
  3. To expose it as a global (for inline <script> compatibility), assign to window:
    window.YourModule = YourModule;
    

Guides

User guides for getting started, migration, troubleshooting, and client libraries.

Contents

Quick Start with Docker

Get the Adblock Compiler up and running in minutes with Docker.

Prerequisites

  • Docker installed on your system
  • Docker Compose (comes with Docker Desktop)

Quick Start

1. Clone the Repository

git clone https://github.com/jaypatrick/adblock-compiler.git
cd adblock-compiler

2. Start with Docker Compose

docker compose up -d

That's it! The compiler is now running.

3. Access the Application

  • Web UI: http://localhost:8787
  • API Documentation: http://localhost:8787/api
  • Test Interface: http://localhost:8787/test.html
  • Metrics: http://localhost:8787/metrics

Example Usage

Using the Web UI

  1. Open http://localhost:8787 in your browser
  2. Switch to "Simple Mode" or "Advanced Mode"
  3. Add filter list URLs or paste a configuration
  4. Click "Compile" and watch the real-time progress
  5. Download or copy the compiled filter list

Using the API

Compile a filter list programmatically:

curl -X POST http://localhost:8787/compile \
  -H "Content-Type: application/json" \
  -d '{
    "configuration": {
      "name": "My Filter List",
      "sources": [
        {
          "source": "https://adguardteam.github.io/AdGuardSDNSFilter/Filters/filter.txt",
          "transformations": ["RemoveComments", "Deduplicate"]
        }
      ],
      "transformations": ["RemoveEmptyLines"]
    }
  }'

Streaming Compilation

Get real-time progress updates using Server-Sent Events:

curl -N -X POST http://localhost:8787/compile/stream \
  -H "Content-Type: application/json" \
  -d '{
    "configuration": {
      "name": "My Filter List",
      "sources": [{"source": "https://example.com/filters.txt"}]
    }
  }'

Managing the Container

View Logs

docker compose logs -f

Stop the Container

docker compose down

Restart the Container

docker compose restart

Update the Container

git pull
docker compose down
docker compose build --no-cache
docker compose up -d

Configuration

Environment Variables

Copy the example environment file and customize:

cp .env.example .env
# Edit .env with your preferred settings

Available variables:

  • COMPILER_VERSION: Version identifier (default: 0.6.0)
  • PORT: Server port (default: 8787)
  • DENO_DIR: Deno cache directory (default: /app/.deno)

Custom Port

To run on a different port, edit docker-compose.yml:

ports:
    - '8080:8787' # Runs on port 8080 instead

Development Mode

For active development with live reload:

# Source code is already mounted in docker-compose.yml
docker compose up

Changes to files in src/, worker/, and public/ will be reflected automatically.

Troubleshooting

Port Already in Use

If port 8787 is already in use:

# Stop the conflicting service or change the port in docker-compose.yml
docker compose down
# Edit docker-compose.yml to use a different port
docker compose up -d

Container Won't Start

Check the logs:

docker compose logs

Permission Issues

If you encounter permission errors with volumes:

sudo chown -R 1001:1001 ./output

Next Steps

Need Help?

  • Issues: https://github.com/jaypatrick/adblock-compiler/issues
  • Documentation: See DOCKER.md and README.md

Client Libraries & Examples

Official and community client libraries for the Adblock Compiler API.

Official Clients

Python

Modern async client using httpx with full type annotations.

from __future__ import annotations

import httpx
from dataclasses import dataclass
from typing import AsyncIterator, Iterator
from collections.abc import Callable

@dataclass
class Source:
    """Filter list source configuration."""
    source: str
    name: str | None = None
    type: str | None = None  # 'adblock' or 'hosts'
    transformations: list[str] | None = None

@dataclass
class CompileResult:
    """Compilation result with metrics."""
    success: bool
    rules: list[str]
    rule_count: int
    cached: bool = False
    metrics: dict | None = None
    error: str | None = None

class AdblockCompilerError(Exception):
    """Raised when compilation fails."""
    pass

class AdblockCompiler:
    """Modern async/sync Python client for Adblock Compiler API."""

    DEFAULT_URL = "https://adblock-compiler.jayson-knight.workers.dev"
    DEFAULT_TRANSFORMS = ["Deduplicate", "RemoveEmptyLines"]

    def __init__(
        self,
        base_url: str = DEFAULT_URL,
        timeout: float = 30.0,
        max_retries: int = 3,
    ) -> None:
        self.base_url = base_url.rstrip("/")
        self.timeout = timeout
        self.max_retries = max_retries

    def _build_payload(
        self,
        sources: list[Source | dict],
        name: str,
        transformations: list[str] | None,
        benchmark: bool,
    ) -> dict:
        source_list = [
            s if isinstance(s, dict) else {
                "source": s.source,
                "name": s.name,
                "type": s.type,
                "transformations": s.transformations,
            }
            for s in sources
        ]
        return {
            "configuration": {
                "name": name,
                "sources": source_list,
                "transformations": transformations or self.DEFAULT_TRANSFORMS,
            },
            "benchmark": benchmark,
        }

    def _parse_result(self, data: dict) -> CompileResult:
        if not data.get("success", False):
            raise AdblockCompilerError(data.get("error", "Unknown error"))
        return CompileResult(
            success=True,
            rules=data.get("rules", []),
            rule_count=data.get("ruleCount", 0),
            cached=data.get("cached", False),
            metrics=data.get("metrics"),
        )

    def compile(
        self,
        sources: list[Source | dict],
        name: str = "Compiled List",
        transformations: list[str] | None = None,
        benchmark: bool = False,
    ) -> CompileResult:
        """Synchronous compilation."""
        payload = self._build_payload(sources, name, transformations, benchmark)

        transport = httpx.HTTPTransport(retries=self.max_retries)
        with httpx.Client(transport=transport, timeout=self.timeout) as client:
            response = client.post(
                f"{self.base_url}/compile",
                json=payload,
                headers={"Content-Type": "application/json"},
            )
            response.raise_for_status()
            return self._parse_result(response.json())

    async def compile_async(
        self,
        sources: list[Source | dict],
        name: str = "Compiled List",
        transformations: list[str] | None = None,
        benchmark: bool = False,
    ) -> CompileResult:
        """Asynchronous compilation."""
        payload = self._build_payload(sources, name, transformations, benchmark)

        transport = httpx.AsyncHTTPTransport(retries=self.max_retries)
        async with httpx.AsyncClient(transport=transport, timeout=self.timeout) as client:
            response = await client.post(
                f"{self.base_url}/compile",
                json=payload,
                headers={"Content-Type": "application/json"},
            )
            response.raise_for_status()
            return self._parse_result(response.json())

    def compile_stream(
        self,
        sources: list[Source | dict],
        name: str = "Compiled List",
        transformations: list[str] | None = None,
        on_event: Callable[[str, dict], None] | None = None,
    ) -> Iterator[tuple[str, dict]]:
        """Stream compilation events using SSE."""
        payload = self._build_payload(sources, name, transformations, benchmark=False)

        with httpx.Client(timeout=None) as client:
            with client.stream(
                "POST",
                f"{self.base_url}/compile/stream",
                json=payload,
                headers={"Content-Type": "application/json"},
            ) as response:
                response.raise_for_status()
                event_type = ""

                for line in response.iter_lines():
                    if line.startswith("event: "):
                        event_type = line[7:]
                    elif line.startswith("data: "):
                        import json
                        data = json.loads(line[6:])
                        if on_event:
                            on_event(event_type, data)
                        yield event_type, data

    async def compile_stream_async(
        self,
        sources: list[Source | dict],
        name: str = "Compiled List",
        transformations: list[str] | None = None,
    ) -> AsyncIterator[tuple[str, dict]]:
        """Async stream compilation events using SSE."""
        payload = self._build_payload(sources, name, transformations, benchmark=False)

        async with httpx.AsyncClient(timeout=None) as client:
            async with client.stream(
                "POST",
                f"{self.base_url}/compile/stream",
                json=payload,
                headers={"Content-Type": "application/json"},
            ) as response:
                response.raise_for_status()
                event_type = ""

                async for line in response.aiter_lines():
                    if line.startswith("event: "):
                        event_type = line[7:]
                    elif line.startswith("data: "):
                        import json
                        data = json.loads(line[6:])
                        yield event_type, data


# Example usage
if __name__ == "__main__":
    import asyncio

    client = AdblockCompiler()

    # Synchronous compilation
    result = client.compile(
        sources=[Source(source="https://easylist.to/easylist/easylist.txt")],
        name="My Filter List",
        benchmark=True,
    )
    print(f"Compiled {result.rule_count} rules")
    if result.metrics:
        print(f"Duration: {result.metrics['totalDurationMs']}ms")

    # Async compilation
    async def main():
        result = await client.compile_async(
            sources=[{"source": "https://easylist.to/easylist/easylist.txt"}],
            benchmark=True,
        )
        print(f"Async compiled {result.rule_count} rules")

        # Async streaming
        async for event_type, data in client.compile_stream_async(
            sources=[{"source": "https://easylist.to/easylist/easylist.txt"}],
        ):
            if event_type == "progress":
                print(f"Progress: {data.get('message')}")
            elif event_type == "result":
                print(f"Complete! {data['ruleCount']} rules")

    asyncio.run(main())

JavaScript/TypeScript

Modern TypeScript client with retry logic, AbortController support, and custom error handling.

// Types
interface Source {
    source: string;
    name?: string;
    type?: 'adblock' | 'hosts';
    transformations?: string[];
}

interface CompileOptions {
    name?: string;
    transformations?: string[];
    benchmark?: boolean;
    signal?: AbortSignal;
}

interface CompileResult {
    success: boolean;
    rules: string[];
    ruleCount: number;
    cached: boolean;
    metrics?: {
        totalDurationMs: number;
        sourceCount: number;
        ruleCount: number;
    };
}

interface StreamEvent {
    event: 'progress' | 'result' | 'error';
    data: Record<string, unknown>;
}

// Custom errors
class AdblockCompilerError extends Error {
    constructor(
        message: string,
        public readonly statusCode?: number,
        public readonly retryAfter?: number,
    ) {
        super(message);
        this.name = 'AdblockCompilerError';
    }
}

class RateLimitError extends AdblockCompilerError {
    constructor(retryAfter: number) {
        super(`Rate limited. Retry after ${retryAfter}s`, 429, retryAfter);
        this.name = 'RateLimitError';
    }
}

// Client
class AdblockCompiler {
    private readonly baseUrl: string;
    private readonly maxRetries: number;
    private readonly retryDelayMs: number;

    static readonly DEFAULT_URL = 'https://adblock-compiler.jayson-knight.workers.dev';
    static readonly DEFAULT_TRANSFORMS = ['Deduplicate', 'RemoveEmptyLines'];

    constructor(options: {
        baseUrl?: string;
        maxRetries?: number;
        retryDelayMs?: number;
    } = {}) {
        this.baseUrl = options.baseUrl?.replace(/\/$/, '') ?? AdblockCompiler.DEFAULT_URL;
        this.maxRetries = options.maxRetries ?? 3;
        this.retryDelayMs = options.retryDelayMs ?? 1000;
    }

    private async fetchWithRetry(
        url: string,
        init: RequestInit,
        retries = this.maxRetries,
    ): Promise<Response> {
        let lastError: Error | undefined;

        for (let attempt = 0; attempt <= retries; attempt++) {
            try {
                const response = await fetch(url, init);

                if (response.status === 429) {
                    const retryAfter = parseInt(response.headers.get('Retry-After') ?? '60', 10);
                    throw new RateLimitError(retryAfter);
                }

                if (!response.ok) {
                    throw new AdblockCompilerError(
                        `HTTP ${response.status}: ${response.statusText}`,
                        response.status,
                    );
                }

                return response;
            } catch (error) {
                lastError = error as Error;

                // Don't retry on rate limits or abort
                if (error instanceof RateLimitError) throw error;
                if (init.signal?.aborted) throw error;

                // Retry on network errors
                if (attempt < retries) {
                    await new Promise(r => setTimeout(r, this.retryDelayMs * (attempt + 1)));
                }
            }
        }

        throw lastError;
    }

    async compile(sources: Source[], options: CompileOptions = {}): Promise<CompileResult> {
        const payload = {
            configuration: {
                name: options.name ?? 'Compiled List',
                sources,
                transformations: options.transformations ?? AdblockCompiler.DEFAULT_TRANSFORMS,
            },
            benchmark: options.benchmark ?? false,
        };

        const response = await this.fetchWithRetry(
            `${this.baseUrl}/compile`,
            {
                method: 'POST',
                headers: { 'Content-Type': 'application/json' },
                body: JSON.stringify(payload),
                signal: options.signal,
            },
        );

        const result = await response.json();

        if (!result.success) {
            throw new AdblockCompilerError(`Compilation failed: ${result.error}`);
        }

        return result;
    }

    async *compileStream(
        sources: Source[],
        options: Omit<CompileOptions, 'benchmark'> = {},
    ): AsyncGenerator<StreamEvent> {
        const payload = {
            configuration: {
                name: options.name ?? 'Compiled List',
                sources,
                transformations: options.transformations ?? AdblockCompiler.DEFAULT_TRANSFORMS,
            },
        };

        const response = await this.fetchWithRetry(
            `${this.baseUrl}/compile/stream`,
            {
                method: 'POST',
                headers: { 'Content-Type': 'application/json' },
                body: JSON.stringify(payload),
                signal: options.signal,
            },
        );

        const reader = response.body!.getReader();
        const decoder = new TextDecoder();
        let buffer = '';
        let currentEvent = '';

        try {
            while (true) {
                const { done, value } = await reader.read();
                if (done) break;

                buffer += decoder.decode(value, { stream: true });
                const lines = buffer.split('\n');
                buffer = lines.pop() ?? '';

                for (const line of lines) {
                    if (line.startsWith('event: ')) {
                        currentEvent = line.slice(7);
                    } else if (line.startsWith('data: ')) {
                        yield {
                            event: currentEvent as StreamEvent['event'],
                            data: JSON.parse(line.slice(6)),
                        };
                    }
                }
            }
        } finally {
            reader.releaseLock();
        }
    }
}

// Example usage
const client = new AdblockCompiler({ maxRetries: 3 });

// With AbortController for cancellation
const controller = new AbortController();
setTimeout(() => controller.abort(), 30000); // 30s timeout

try {
    const result = await client.compile(
        [{ source: 'https://easylist.to/easylist/easylist.txt' }],
        {
            name: 'My Filter List',
            benchmark: true,
            signal: controller.signal,
        },
    );

    console.log(`Compiled ${result.ruleCount} rules`);
    console.log(`Duration: ${result.metrics?.totalDurationMs}ms`);
    console.log(`Cached: ${result.cached}`);
} catch (error) {
    if (error instanceof RateLimitError) {
        console.log(`Rate limited. Retry after ${error.retryAfter}s`);
    } else {
        throw error;
    }
}

// Streaming with progress updates
for await (const { event, data } of client.compileStream([
    { source: 'https://easylist.to/easylist/easylist.txt' },
])) {
    switch (event) {
        case 'progress':
            console.log(`Progress: ${data.message}`);
            break;
        case 'result':
            console.log(`Complete! ${data.ruleCount} rules`);
            break;
        case 'error':
            console.error(`Error: ${data.message}`);
            break;
    }
}

Go

Modern Go client with context support, retry logic, and proper error handling.

package adblock

import (
	"bufio"
	"bytes"
	"context"
	"encoding/json"
	"errors"
	"fmt"
	"net/http"
	"strconv"
	"strings"
	"time"
)

const (
	DefaultBaseURL    = "https://adblock-compiler.jayson-knight.workers.dev"
	DefaultTimeout    = 30 * time.Second
	DefaultMaxRetries = 3
)

var (
	ErrRateLimited      = errors.New("rate limited")
	ErrCompilationFailed = errors.New("compilation failed")
)

// Source represents a filter list source.
type Source struct {
	Source          string   `json:"source"`
	Name            string   `json:"name,omitempty"`
	Type            string   `json:"type,omitempty"`
	Transformations []string `json:"transformations,omitempty"`
}

// Metrics contains compilation performance metrics.
type Metrics struct {
	TotalDurationMs int `json:"totalDurationMs"`
	SourceCount     int `json:"sourceCount"`
	RuleCount       int `json:"ruleCount"`
}

// CompileResult represents the compilation response.
type CompileResult struct {
	Success   bool     `json:"success"`
	Rules     []string `json:"rules"`
	RuleCount int      `json:"ruleCount"`
	Cached    bool     `json:"cached"`
	Metrics   *Metrics `json:"metrics,omitempty"`
	Error     string   `json:"error,omitempty"`
}

// Event represents a Server-Sent Event from streaming compilation.
type Event struct {
	Type string
	Data map[string]any
}

// CompileOptions configures a compilation request.
type CompileOptions struct {
	Name            string
	Transformations []string
	Benchmark       bool
}

// Compiler is the Adblock Compiler API client.
type Compiler struct {
	baseURL    string
	client     *http.Client
	maxRetries int
}

// Option configures a Compiler.
type Option func(*Compiler)

// WithBaseURL sets a custom API base URL.
func WithBaseURL(url string) Option {
	return func(c *Compiler) { c.baseURL = strings.TrimRight(url, "/") }
}

// WithTimeout sets the HTTP client timeout.
func WithTimeout(d time.Duration) Option {
	return func(c *Compiler) { c.client.Timeout = d }
}

// WithMaxRetries sets the maximum retry attempts.
func WithMaxRetries(n int) Option {
	return func(c *Compiler) { c.maxRetries = n }
}

// NewCompiler creates a new Adblock Compiler client.
func NewCompiler(opts ...Option) *Compiler {
	c := &Compiler{
		baseURL:    DefaultBaseURL,
		client:     &http.Client{Timeout: DefaultTimeout},
		maxRetries: DefaultMaxRetries,
	}
	for _, opt := range opts {
		opt(c)
	}
	return c
}

func (c *Compiler) doWithRetry(ctx context.Context, req *http.Request) (*http.Response, error) {
	var lastErr error

	for attempt := 0; attempt <= c.maxRetries; attempt++ {
		if attempt > 0 {
			select {
			case <-ctx.Done():
				return nil, ctx.Err()
			case <-time.After(time.Duration(attempt) * time.Second):
			}
		}

		resp, err := c.client.Do(req.WithContext(ctx))
		if err != nil {
			lastErr = err
			continue
		}

		if resp.StatusCode == http.StatusTooManyRequests {
			resp.Body.Close()
			retryAfter, _ := strconv.Atoi(resp.Header.Get("Retry-After"))
			lastErr = fmt.Errorf("%w: retry after %ds", ErrRateLimited, retryAfter)
			continue
		}

		if resp.StatusCode >= 500 {
			resp.Body.Close()
			lastErr = fmt.Errorf("server error: %s", resp.Status)
			continue
		}

		return resp, nil
	}

	return nil, lastErr
}

// Compile compiles filter lists and returns the result.
func (c *Compiler) Compile(ctx context.Context, sources []Source, opts *CompileOptions) (*CompileResult, error) {
	if opts == nil {
		opts = &CompileOptions{}
	}
	if opts.Name == "" {
		opts.Name = "Compiled List"
	}
	if opts.Transformations == nil {
		opts.Transformations = []string{"Deduplicate", "RemoveEmptyLines"}
	}

	payload := map[string]any{
		"configuration": map[string]any{
			"name":            opts.Name,
			"sources":         sources,
			"transformations": opts.Transformations,
		},
		"benchmark": opts.Benchmark,
	}

	body, err := json.Marshal(payload)
	if err != nil {
		return nil, fmt.Errorf("marshal request: %w", err)
	}

	req, err := http.NewRequest(http.MethodPost, c.baseURL+"/compile", bytes.NewReader(body))
	if err != nil {
		return nil, fmt.Errorf("create request: %w", err)
	}
	req.Header.Set("Content-Type", "application/json")

	resp, err := c.doWithRetry(ctx, req)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()

	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("unexpected status: %s", resp.Status)
	}

	var result CompileResult
	if err := json.NewDecoder(resp.Body).Decode(&result); err != nil {
		return nil, fmt.Errorf("decode response: %w", err)
	}

	if !result.Success {
		return nil, fmt.Errorf("%w: %s", ErrCompilationFailed, result.Error)
	}

	return &result, nil
}

// CompileStream compiles filter lists and streams events via a channel.
// The returned channel is closed when the stream ends or context is canceled.
func (c *Compiler) CompileStream(ctx context.Context, sources []Source, opts *CompileOptions) (<-chan Event, <-chan error) {
	events := make(chan Event)
	errc := make(chan error, 1)

	go func() {
		defer close(events)
		defer close(errc)

		if opts == nil {
			opts = &CompileOptions{}
		}
		if opts.Name == "" {
			opts.Name = "Compiled List"
		}
		if opts.Transformations == nil {
			opts.Transformations = []string{"Deduplicate", "RemoveEmptyLines"}
		}

		payload := map[string]any{
			"configuration": map[string]any{
				"name":            opts.Name,
				"sources":         sources,
				"transformations": opts.Transformations,
			},
		}

		body, err := json.Marshal(payload)
		if err != nil {
			errc <- fmt.Errorf("marshal request: %w", err)
			return
		}

		req, err := http.NewRequest(http.MethodPost, c.baseURL+"/compile/stream", bytes.NewReader(body))
		if err != nil {
			errc <- fmt.Errorf("create request: %w", err)
			return
		}
		req.Header.Set("Content-Type", "application/json")

		resp, err := c.client.Do(req.WithContext(ctx))
		if err != nil {
			errc <- err
			return
		}
		defer resp.Body.Close()

		if resp.StatusCode != http.StatusOK {
			errc <- fmt.Errorf("unexpected status: %s", resp.Status)
			return
		}

		scanner := bufio.NewScanner(resp.Body)
		var eventType string

		for scanner.Scan() {
			select {
			case <-ctx.Done():
				errc <- ctx.Err()
				return
			default:
			}

			line := scanner.Text()
			switch {
			case strings.HasPrefix(line, "event: "):
				eventType = strings.TrimPrefix(line, "event: ")
			case strings.HasPrefix(line, "data: "):
				var data map[string]any
				if err := json.Unmarshal([]byte(strings.TrimPrefix(line, "data: ")), &data); err == nil {
					events <- Event{Type: eventType, Data: data}
				}
			}
		}

		if err := scanner.Err(); err != nil {
			errc <- err
		}
	}()

	return events, errc
}

// Example usage
func main() {
	ctx, cancel := context.WithTimeout(context.Background(), 60*time.Second)
	defer cancel()

	client := NewCompiler(
		WithMaxRetries(3),
		WithTimeout(30*time.Second),
	)

	// Simple compilation
	result, err := client.Compile(ctx, []Source{
		{Source: "https://easylist.to/easylist/easylist.txt"},
	}, &CompileOptions{
		Name:      "My Filter List",
		Benchmark: true,
	})
	if err != nil {
		if errors.Is(err, ErrRateLimited) {
			fmt.Println("Rate limited, try again later")
			return
		}
		panic(err)
	}

	fmt.Printf("Compiled %d rules", result.RuleCount)
	if result.Metrics != nil {
		fmt.Printf(" in %dms", result.Metrics.TotalDurationMs)
	}
	fmt.Printf(" (cached: %v)\n", result.Cached)

	// Streaming compilation
	events, errc := client.CompileStream(ctx, []Source{
		{Source: "https://easylist.to/easylist/easylist.txt"},
	}, nil)

	for event := range events {
		switch event.Type {
		case "progress":
			fmt.Printf("Progress: %v\n", event.Data["message"])
		case "result":
			fmt.Printf("Complete! %v rules\n", event.Data["ruleCount"])
		case "error":
			fmt.Printf("Error: %v\n", event.Data["message"])
		}
	}

	if err := <-errc; err != nil {
		fmt.Printf("Stream error: %v\n", err)
	}
}

Rust

Async Rust client using reqwest and tokio.

use reqwest::{Client, StatusCode};
use serde::{Deserialize, Serialize};
use std::time::Duration;
use thiserror::Error;

const DEFAULT_BASE_URL: &str = "https://adblock-compiler.jayson-knight.workers.dev";

#[derive(Error, Debug)]
pub enum AdblockError {
    #[error("HTTP error: {0}")]
    Http(#[from] reqwest::Error),
    #[error("Rate limited, retry after {0}s")]
    RateLimited(u64),
    #[error("Compilation failed: {0}")]
    CompilationFailed(String),
    #[error("Parse error: {0}")]
    Parse(#[from] serde_json::Error),
}

#[derive(Debug, Clone, Serialize)]
pub struct Source {
    pub source: String,
    #[serde(skip_serializing_if = "Option::is_none")]
    pub name: Option<String>,
    #[serde(skip_serializing_if = "Option::is_none")]
    pub r#type: Option<String>,
    #[serde(skip_serializing_if = "Option::is_none")]
    pub transformations: Option<Vec<String>>,
}

impl Source {
    pub fn new(source: impl Into<String>) -> Self {
        Self {
            source: source.into(),
            name: None,
            r#type: None,
            transformations: None,
        }
    }
}

#[derive(Debug, Clone, Deserialize)]
#[serde(rename_all = "camelCase")]
pub struct Metrics {
    pub total_duration_ms: u64,
    pub source_count: usize,
    pub rule_count: usize,
}

#[derive(Debug, Clone, Deserialize)]
#[serde(rename_all = "camelCase")]
pub struct CompileResult {
    pub success: bool,
    pub rules: Vec<String>,
    pub rule_count: usize,
    #[serde(default)]
    pub cached: bool,
    pub metrics: Option<Metrics>,
    pub error: Option<String>,
}

#[derive(Debug, Clone, Serialize)]
struct CompileRequest {
    configuration: Configuration,
    benchmark: bool,
}

#[derive(Debug, Clone, Serialize)]
struct Configuration {
    name: String,
    sources: Vec<Source>,
    transformations: Vec<String>,
}

pub struct AdblockCompiler {
    client: Client,
    base_url: String,
    max_retries: u32,
}

impl Default for AdblockCompiler {
    fn default() -> Self {
        Self::new()
    }
}

impl AdblockCompiler {
    pub fn new() -> Self {
        Self {
            client: Client::builder()
                .timeout(Duration::from_secs(30))
                .build()
                .expect("Failed to create HTTP client"),
            base_url: DEFAULT_BASE_URL.to_string(),
            max_retries: 3,
        }
    }

    pub fn with_base_url(mut self, url: impl Into<String>) -> Self {
        self.base_url = url.into().trim_end_matches('/').to_string();
        self
    }

    pub fn with_timeout(mut self, timeout: Duration) -> Self {
        self.client = Client::builder()
            .timeout(timeout)
            .build()
            .expect("Failed to create HTTP client");
        self
    }

    pub fn with_max_retries(mut self, retries: u32) -> Self {
        self.max_retries = retries;
        self
    }

    pub async fn compile(
        &self,
        sources: Vec<Source>,
        name: Option<&str>,
        transformations: Option<Vec<String>>,
        benchmark: bool,
    ) -> Result<CompileResult, AdblockError> {
        let request = CompileRequest {
            configuration: Configuration {
                name: name.unwrap_or("Compiled List").to_string(),
                sources,
                transformations: transformations
                    .unwrap_or_else(|| vec!["Deduplicate".into(), "RemoveEmptyLines".into()]),
            },
            benchmark,
        };

        let mut last_error = None;

        for attempt in 0..=self.max_retries {
            if attempt > 0 {
                tokio::time::sleep(Duration::from_secs(attempt as u64)).await;
            }

            let response = match self
                .client
                .post(format!("{}/compile", self.base_url))
                .json(&request)
                .send()
                .await
            {
                Ok(resp) => resp,
                Err(e) => {
                    last_error = Some(AdblockError::Http(e));
                    continue;
                }
            };

            match response.status() {
                StatusCode::TOO_MANY_REQUESTS => {
                    let retry_after = response
                        .headers()
                        .get("Retry-After")
                        .and_then(|v| v.to_str().ok())
                        .and_then(|v| v.parse().ok())
                        .unwrap_or(60);
                    last_error = Some(AdblockError::RateLimited(retry_after));
                    continue;
                }
                status if status.is_server_error() => {
                    last_error = Some(AdblockError::CompilationFailed(format!(
                        "Server error: {}",
                        status
                    )));
                    continue;
                }
                _ => {}
            }

            let result: CompileResult = response.json().await?;

            if !result.success {
                return Err(AdblockError::CompilationFailed(
                    result.error.unwrap_or_else(|| "Unknown error".to_string()),
                ));
            }

            return Ok(result);
        }

        Err(last_error.unwrap_or_else(|| AdblockError::CompilationFailed("Max retries exceeded".to_string())))
    }
}

// Example usage
#[tokio::main]
async fn main() -> Result<(), AdblockError> {
    let client = AdblockCompiler::new()
        .with_max_retries(3)
        .with_timeout(Duration::from_secs(60));

    let result = client
        .compile(
            vec![Source::new("https://easylist.to/easylist/easylist.txt")],
            Some("My Filter List"),
            None,
            true,
        )
        .await?;

    println!("Compiled {} rules", result.rule_count);
    if let Some(metrics) = &result.metrics {
        println!("Duration: {}ms", metrics.total_duration_ms);
    }
    println!("Cached: {}", result.cached);

    Ok(())
}

C# / .NET

Modern C# client using HttpClient and async/await patterns.

using System.Net;
using System.Net.Http.Json;
using System.Runtime.CompilerServices;
using System.Text.Json;
using System.Text.Json.Serialization;

namespace AdblockCompiler;

public record Source(
    [property: JsonPropertyName("source")] string Url,
    [property: JsonPropertyName("name")] string? Name = null,
    [property: JsonPropertyName("type")] string? Type = null,
    [property: JsonPropertyName("transformations")] List<string>? Transformations = null
);

public record Metrics(
    [property: JsonPropertyName("totalDurationMs")] int TotalDurationMs,
    [property: JsonPropertyName("sourceCount")] int SourceCount,
    [property: JsonPropertyName("ruleCount")] int RuleCount
);

public record CompileResult(
    [property: JsonPropertyName("success")] bool Success,
    [property: JsonPropertyName("rules")] List<string> Rules,
    [property: JsonPropertyName("ruleCount")] int RuleCount,
    [property: JsonPropertyName("cached")] bool Cached = false,
    [property: JsonPropertyName("metrics")] Metrics? Metrics = null,
    [property: JsonPropertyName("error")] string? Error = null
);

public record StreamEvent(string EventType, JsonElement Data);

public class AdblockCompilerException : Exception
{
    public HttpStatusCode? StatusCode { get; }
    public int? RetryAfter { get; }

    public AdblockCompilerException(string message, HttpStatusCode? statusCode = null, int? retryAfter = null)
        : base(message)
    {
        StatusCode = statusCode;
        RetryAfter = retryAfter;
    }
}

public class RateLimitException : AdblockCompilerException
{
    public RateLimitException(int retryAfter)
        : base($"Rate limited. Retry after {retryAfter}s", HttpStatusCode.TooManyRequests, retryAfter) { }
}

public sealed class AdblockCompilerClient : IDisposable
{
    private const string DefaultBaseUrl = "https://adblock-compiler.jayson-knight.workers.dev";
    private static readonly string[] DefaultTransformations = ["Deduplicate", "RemoveEmptyLines"];

    private readonly HttpClient _httpClient;
    private readonly string _baseUrl;
    private readonly int _maxRetries;

    public AdblockCompilerClient(
        string? baseUrl = null,
        TimeSpan? timeout = null,
        int maxRetries = 3)
    {
        _baseUrl = (baseUrl ?? DefaultBaseUrl).TrimEnd('/');
        _maxRetries = maxRetries;
        _httpClient = new HttpClient { Timeout = timeout ?? TimeSpan.FromSeconds(30) };
    }

    public async Task<CompileResult> CompileAsync(
        IEnumerable<Source> sources,
        string? name = null,
        IEnumerable<string>? transformations = null,
        bool benchmark = false,
        CancellationToken cancellationToken = default)
    {
        var request = new
        {
            configuration = new
            {
                name = name ?? "Compiled List",
                sources = sources.ToList(),
                transformations = transformations?.ToList() ?? DefaultTransformations.ToList()
            },
            benchmark
        };

        Exception? lastException = null;

        for (var attempt = 0; attempt <= _maxRetries; attempt++)
        {
            if (attempt > 0)
            {
                await Task.Delay(TimeSpan.FromSeconds(attempt), cancellationToken);
            }

            try
            {
                var response = await _httpClient.PostAsJsonAsync(
                    $"{_baseUrl}/compile",
                    request,
                    cancellationToken);

                if (response.StatusCode == HttpStatusCode.TooManyRequests)
                {
                    var retryAfter = int.TryParse(
                        response.Headers.GetValues("Retry-After").FirstOrDefault(),
                        out var ra) ? ra : 60;
                    throw new RateLimitException(retryAfter);
                }

                response.EnsureSuccessStatusCode();

                var result = await response.Content.ReadFromJsonAsync<CompileResult>(cancellationToken)
                    ?? throw new AdblockCompilerException("Failed to deserialize response");

                if (!result.Success)
                {
                    throw new AdblockCompilerException($"Compilation failed: {result.Error}");
                }

                return result;
            }
            catch (RateLimitException)
            {
                throw;
            }
            catch (OperationCanceledException)
            {
                throw;
            }
            catch (Exception ex)
            {
                lastException = ex;
            }
        }

        throw lastException ?? new AdblockCompilerException("Max retries exceeded");
    }

    public async IAsyncEnumerable<StreamEvent> CompileStreamAsync(
        IEnumerable<Source> sources,
        string? name = null,
        IEnumerable<string>? transformations = null,
        [EnumeratorCancellation] CancellationToken cancellationToken = default)
    {
        var request = new
        {
            configuration = new
            {
                name = name ?? "Compiled List",
                sources = sources.ToList(),
                transformations = transformations?.ToList() ?? DefaultTransformations.ToList()
            }
        };

        var response = await _httpClient.PostAsJsonAsync(
            $"{_baseUrl}/compile/stream",
            request,
            cancellationToken);

        response.EnsureSuccessStatusCode();

        await using var stream = await response.Content.ReadAsStreamAsync(cancellationToken);
        using var reader = new StreamReader(stream);

        var currentEvent = "";

        while (!reader.EndOfStream)
        {
            cancellationToken.ThrowIfCancellationRequested();

            var line = await reader.ReadLineAsync(cancellationToken);
            if (string.IsNullOrEmpty(line)) continue;

            if (line.StartsWith("event: "))
            {
                currentEvent = line[7..];
            }
            else if (line.StartsWith("data: "))
            {
                var data = JsonSerializer.Deserialize<JsonElement>(line[6..]);
                yield return new StreamEvent(currentEvent, data);
            }
        }
    }

    public void Dispose() => _httpClient.Dispose();
}

// Example usage
public static class Program
{
    public static async Task Main()
    {
        using var client = new AdblockCompilerClient(
            timeout: TimeSpan.FromSeconds(60),
            maxRetries: 3);

        try
        {
            // Simple compilation
            var result = await client.CompileAsync(
                sources: [new Source("https://easylist.to/easylist/easylist.txt")],
                name: "My Filter List",
                benchmark: true);

            Console.WriteLine($"Compiled {result.RuleCount} rules");
            if (result.Metrics is not null)
            {
                Console.WriteLine($"Duration: {result.Metrics.TotalDurationMs}ms");
            }
            Console.WriteLine($"Cached: {result.Cached}");

            // Streaming compilation
            await foreach (var evt in client.CompileStreamAsync(
                sources: [new Source("https://easylist.to/easylist/easylist.txt")]))
            {
                switch (evt.EventType)
                {
                    case "progress":
                        Console.WriteLine($"Progress: {evt.Data.GetProperty("message")}");
                        break;
                    case "result":
                        Console.WriteLine($"Complete! {evt.Data.GetProperty("ruleCount")} rules");
                        break;
                    case "error":
                        Console.WriteLine($"Error: {evt.Data.GetProperty("message")}");
                        break;
                }
            }
        }
        catch (RateLimitException ex)
        {
            Console.WriteLine($"Rate limited. Retry after {ex.RetryAfter}s");
        }
    }
}

Community Clients

Contributions welcome for additional language support:

  • Ruby
  • PHP
  • Java
  • Swift
  • Kotlin

Installation

Python

pip install httpx  # Modern async HTTP client
# Save the client code as adblock_compiler.py

JavaScript/TypeScript

# No dependencies required - uses native fetch
# Works in Node.js 18+, Deno, Bun, and all modern browsers

Go

go get  # No external dependencies - uses standard library
# Save as adblock/compiler.go

Rust

# Add to Cargo.toml
[dependencies]
reqwest = { version = "0.12", features = ["json"] }
serde = { version = "1.0", features = ["derive"] }
serde_json = "1.0"
thiserror = "2.0"
tokio = { version = "1", features = ["full"] }

C# / .NET

# .NET 8+ required (uses native JSON and HTTP support)
dotnet new console
# No additional packages needed

Error Handling

All clients handle the following errors:

  • 429 Too Many Requests: Rate limit exceeded (max 10 req/min)
  • 400 Bad Request: Invalid configuration
  • 500 Internal Server Error: Compilation failed

Caching

The API automatically caches compilation results for 1 hour. Check the X-Cache header:

  • HIT: Result served from cache
  • MISS: Fresh compilation

Rate Limiting

  • Limit: 10 requests per minute per IP
  • Window: 60 seconds (sliding)
  • Response: HTTP 429 with Retry-After header

Support

Migration Guide

Migrating from @adguard/hostlist-compiler to AdBlock Compiler.

Overview

AdBlock Compiler is a drop-in replacement for @adguard/hostlist-compiler with the same API surface and enhanced features. The migration process is straightforward and requires minimal code changes.

Why Migrate?

  • Same API - No breaking changes to core functionality
  • Better Performance - Gzip compression, request deduplication, smart caching
  • Production Ready - Circuit breaker, rate limiting, error handling
  • Modern Stack - Deno-native, zero Node.js dependencies
  • Cloudflare Workers - Deploy as serverless functions
  • Real-time Progress - Server-Sent Events for compilation tracking
  • Visual Diff - See changes between compilations
  • Batch Processing - Compile multiple lists in parallel

Quick Migration

1. Update Package Reference

npm/Node.js:

{
    "dependencies": {
        "@adguard/hostlist-compiler": "^1.0.39", // OLD
        "@jk-com/adblock-compiler": "^0.6.0" // NEW
    }
}

Deno:

// OLD
import { compile } from 'npm:@adguard/hostlist-compiler@^1.0.39';

// NEW
import { compile } from 'jsr:@jk-com/adblock-compiler@^0.6.0';

2. Update Imports

Replace all import statements:

// OLD
import { compile, FilterCompiler } from '@adguard/hostlist-compiler';

// NEW
import { compile, FilterCompiler } from '@jk-com/adblock-compiler';

That's it! Your code should work without any other changes.

API Compatibility

Core Functions

All core functions remain unchanged:

// compile() - SAME API
const rules = await compile(configuration);

// FilterCompiler class - SAME API
const compiler = new FilterCompiler();
const result = await compiler.compile(configuration);

Configuration Schema

The configuration schema is 100% compatible:

interface IConfiguration {
    name: string;
    description?: string;
    homepage?: string;
    license?: string;
    version?: string;
    sources: ISource[];
    transformations?: TransformationType[];
    exclusions?: string[];
    exclusions_sources?: string[];
    inclusions?: string[];
    inclusions_sources?: string[];
}

Transformations

All 11 transformations are supported with identical behavior:

  1. ConvertToAscii
  2. TrimLines
  3. RemoveComments
  4. Compress
  5. RemoveModifiers
  6. InvertAllow
  7. Validate
  8. ValidateAllowIp
  9. Deduplicate
  10. RemoveEmptyLines
  11. InsertFinalNewLine

New Features (Optional)

After migrating, you can optionally use new features:

Server-Sent Events

import { WorkerCompiler } from '@jk-com/adblock-compiler';

const compiler = new WorkerCompiler({
    events: {
        onSourceStart: (event) => console.log('Fetching:', event.source.name),
        onProgress: (event) => console.log(`${event.current}/${event.total}`),
        onCompilationComplete: (event) => console.log('Done!', event.ruleCount),
    },
});

await compiler.compileWithMetrics(configuration, true);

Batch Compilation API

// Using the deployed API
const response = await fetch('https://adblock-compiler.jayson-knight.workers.dev/compile/batch', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
        requests: [
            { id: 'list-1', configuration: config1 },
            { id: 'list-2', configuration: config2 },
        ],
    }),
});

const { results } = await response.json();

Visual Diff

Use the Web UI at https://adblock-compiler.jayson-knight.workers.dev/ to see visual diffs between compilations.

Platform-Specific Migration

Node.js Projects

Before:

const { compile } = require('@adguard/hostlist-compiler');

After:

// Install via npm
npm install @jk-com/adblock-compiler

// Use the package
const { compile } = require('@jk-com/adblock-compiler');

Deno Projects

Before:

import { compile } from 'npm:@adguard/hostlist-compiler';

After:

// Preferred: Use JSR
import { compile } from 'jsr:@jk-com/adblock-compiler';

// Or via npm compatibility
import { compile } from 'npm:@jk-com/adblock-compiler';

TypeScript Projects

Before:

import { compile, IConfiguration } from '@adguard/hostlist-compiler';

After:

import { compile, IConfiguration } from '@jk-com/adblock-compiler';

Types are included—no need for separate @types packages.

Breaking Changes

None! ✨

AdBlock Compiler maintains 100% API compatibility with @adguard/hostlist-compiler. All existing code should work without modifications.

Behavioral Differences

The following improvements are automatic (no code changes needed):

  1. Error Messages - More detailed error messages with error codes
  2. Performance - Faster compilation with parallel source processing
  3. Validation - Enhanced validation with better error reporting
  4. Caching - Automatic caching when deployed as Cloudflare Worker

Testing Your Migration

1. Update Dependencies

# npm
npm uninstall @adguard/hostlist-compiler
npm install @jk-com/adblock-compiler

# Deno
# Just update your import URLs

2. Run Your Tests

npm test
# or
deno test

3. Verify Output

Compile a test filter list and verify the output:

# Should produce identical results
diff old-output.txt new-output.txt

Rollback Plan

If you need to rollback:

# npm
npm uninstall @jk-com/adblock-compiler
npm install @adguard/hostlist-compiler@^1.0.39

# Deno - just revert your imports

Support & Resources

  • Documentation: docs/api/README.md
  • Web UI: https://adblock-compiler.jayson-knight.workers.dev/
  • API Reference: https://adblock-compiler.jayson-knight.workers.dev/api
  • GitHub Issues: https://github.com/jaypatrick/adblock-compiler/issues
  • Examples: docs/guides/clients.md

Common Issues

Issue: Package not found

error: JSR package not found: @jk-com/adblock-compiler

Solution: The package needs to be published to JSR first. Use npm import as fallback:

import { compile } from 'npm:@jk-com/adblock-compiler';

Issue: Type errors

Type 'SourceType' is not assignable to type 'SourceType'

Solution: Clear your TypeScript cache and rebuild:

# Deno
rm -rf ~/.cache/deno

# Node
rm -rf node_modules && npm install

Issue: Different output

If the compiled output differs significantly, please file an issue with:

  1. Your configuration file
  2. Expected output vs actual output
  3. Version numbers of both packages

FAQ

Q: Will this break my existing code?

A: No. AdBlock Compiler is designed as a drop-in replacement with 100% API compatibility.

Q: Do I need to change my configuration files?

A: No. All configuration files (JSON, YAML, TOML) work identically.

Q: Can I use both packages simultaneously?

A: Yes, but not recommended. The packages have the same exports and will conflict.

Q: What about performance?

A: AdBlock Compiler is generally faster due to better parallelization and Deno's optimizations.

Q: Is there a migration tool?

A: Not needed! Just update your import statements and you're done.

Q: What if I find a bug?

A: Report it at https://github.com/jaypatrick/adblock-compiler/issues

Success Stories

After migrating, users typically see:

  • 30-50% faster compilation times
  • 📉 70-80% reduced cache storage usage
  • 🔄 Zero downtime during migration
  • 100% test pass rate after migration

Next Steps

  1. ✅ Update package dependencies
  2. ✅ Update import statements
  3. ✅ Run tests
  4. ✅ Deploy with confidence!
  5. 🎉 Enjoy new features (SSE, batch API, visual diff)

Need help? Open an issue or check the documentation!

Troubleshooting Guide

Common issues and solutions for AdBlock Compiler.

Table of Contents

Installation Issues

Package not found on JSR

Error:

error: JSR package not found: @jk-com/adblock-compiler

Solution: Use npm import as fallback:

import { compile } from 'npm:@jk-com/adblock-compiler';

Or install via npm:

npm install @jk-com/adblock-compiler

Deno version incompatibility

Error:

error: Unsupported Deno version

Solution: AdBlock Compiler requires Deno 2.0 or higher:

deno upgrade
deno --version  # Should be 2.0.0 or higher

Permission denied errors

Error:

error: Requires net access to "example.com"

Solution: Grant necessary permissions:

# Allow all network access
deno run --allow-net your-script.ts

# Allow specific hosts
deno run --allow-net=example.com,github.com your-script.ts

# For file access
deno run --allow-read --allow-net your-script.ts

Compilation Errors

Invalid configuration

Error:

ValidationError: Invalid configuration: sources is required

Solution: Ensure your configuration has required fields:

const config: IConfiguration = {
    name: 'My Filter List', // REQUIRED
    sources: [ // REQUIRED
        {
            name: 'Source 1',
            source: 'https://example.com/list.txt',
        },
    ],
    // Optional fields...
};

Source fetch failures

Error:

Error fetching source: 404 Not Found

Solutions:

  1. Check URL validity:
// Verify the URL is accessible
const response = await fetch(sourceUrl);
console.log(response.status); // Should be 200
  1. Handle 404s gracefully:
// Use exclusions_sources to skip broken sources
const config = {
    name: 'My List',
    sources: [
        { name: 'Good', source: 'https://good.com/list.txt' },
        { name: 'Broken', source: 'https://broken.com/404.txt' },
    ],
    exclusions_sources: ['https://broken.com/404.txt'],
};
  1. Check circuit breaker:
Source temporarily disabled due to repeated failures

Wait 5 minutes for circuit breaker to reset, or check the source availability.

Transformation errors

Error:

TransformationError: Invalid rule at line 42

Solution: Enable validation transformation to see detailed errors:

const config = {
  name: "My List",
  sources: [...],
  transformations: [
    "Validate",  // Add this to see validation details
    "RemoveComments",
    "Deduplicate"
  ]
};

Memory issues

Error:

JavaScript heap out of memory

Solutions:

  1. Increase memory limit (Node.js):
node --max-old-space-size=4096 your-script.js
  1. Use streaming for large files:
// Process sources in chunks
const config = {
    sources: smallBatch, // Process 10-20 sources at a time
    transformations: ['Compress', 'Deduplicate'],
};
  1. Enable compression:
transformations: ['Compress']; // Reduces memory usage

Performance Issues

Slow compilation

Symptoms:

  • Compilation takes >60 seconds
  • High CPU usage
  • Unresponsive UI

Solutions:

  1. Enable caching (API/Worker):
// Cloudflare Worker automatically caches
// Check cache headers:
X-Cache-Status: HIT
  1. Use batch API for multiple lists:
// Compile in parallel
POST /compile/batch
{
  "requests": [
    { "id": "list1", "configuration": {...} },
    { "id": "list2", "configuration": {...} }
  ]
}
  1. Optimize transformations:
// Minimal transformations for speed
transformations: [
    'RemoveComments',
    'Deduplicate',
    'RemoveEmptyLines',
];

// Remove expensive transformations like:
// - Validate (checks every rule)
// - ConvertToAscii (processes every character)
  1. Check source count:
// Limit to 20-30 sources max
// Too many sources = slow compilation
console.log(config.sources.length);

High memory usage

Solution:

// Use Compress transformation
transformations: ['Compress', 'Deduplicate'];

// This reduces memory usage by 70-80%

Request deduplication not working

Issue: Multiple identical requests all compile instead of using cached result.

Solution: Ensure requests are identical:

// These are DIFFERENT requests (different order)
const req1 = { sources: [a, b] };
const req2 = { sources: [b, a] };

// These are IDENTICAL (will be deduplicated)
const req1 = { sources: [a, b] };
const req2 = { sources: [a, b] };

Check for deduplication:

X-Request-Deduplication: HIT

Network & API Issues

Rate limiting

Error:

429 Too Many Requests
Retry-After: 60

Solution: Respect rate limits:

const retryAfter = response.headers.get('Retry-After');
await new Promise((resolve) => setTimeout(resolve, retryAfter * 1000));

Rate limits:

  • Per IP: 60 requests/minute
  • Per endpoint: 100 requests/minute

CORS errors

Error:

Access to fetch at 'https://...' from origin 'https://...' has been blocked by CORS

Solution: Use the API endpoint which has CORS enabled:

// ✅ CORRECT - CORS enabled
fetch('https://adblock-compiler.jayson-knight.workers.dev/compile', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ configuration }),
});

// ❌ WRONG - Direct source fetch (no CORS)
fetch('https://random-site.com/list.txt');

Timeout errors

Error:

TimeoutError: Request timed out after 30000ms

Solution:

  1. Check source availability:
curl -I https://source-url.com/list.txt
  1. Circuit breaker will retry:
  • Automatic retry with exponential backoff
  • Up to 3 attempts
  • Then source is temporarily disabled
  1. Use fallback sources:
sources: [
    { name: 'Primary', source: 'https://primary.com/list.txt' },
    { name: 'Mirror', source: 'https://mirror.com/list.txt' }, // Fallback
];

SSL/TLS errors

Error:

error: Invalid certificate

Solution:

# Deno - use --unsafely-ignore-certificate-errors (not recommended)
deno run --unsafely-ignore-certificate-errors script.ts

# Better: Fix the source's SSL certificate
# Or use HTTP if available (less secure)

Cache Issues

Stale cache

Issue: API returns old/outdated results.

Solution:

  1. Check cache age:
const response = await fetch('/compile', {...});
console.log(response.headers.get('X-Cache-Age'));  // Seconds
  1. Force cache refresh: Add a unique parameter:
const config = {
  name: "My List",
  version: new Date().toISOString(),  // Forces new cache key
  sources: [...]
};
  1. Cache TTL:
  • Default: 1 hour
  • Max: 24 hours

Cache miss rate high

Issue:

X-Cache-Status: MISS

Most requests miss cache.

Solution: Use consistent configuration:

// BAD - timestamp changes every time
const config = {
  name: "My List",
  version: Date.now().toString(),  // Always different!
  sources: [...]
};

// GOOD - stable configuration
const config = {
  name: "My List",
  version: "1.0.0",  // Static version
  sources: [...]
};

Compressed cache errors

Error:

DecompressionError: Invalid compressed data

Solution: Clear cache and recompile:

// Cache will be automatically rebuilt
// If persistent, file a GitHub issue

Deployment Issues

deno: not found error during deployment

Error:

Executing user deploy command: deno deploy
/bin/sh: 1: deno: not found
Failed: error occurred while running deploy command

Cause: This error occurs when Cloudflare Pages is configured with deno deploy as the deploy command. This project uses Cloudflare Workers (not Deno Deploy) and should use wrangler deploy instead.

Solution: Update your Cloudflare Pages dashboard configuration:

  1. Go to your Pages project settings
  2. Navigate to "Builds & deployments"
  3. Under "Build configuration":
    • Set Build command to: npm install
    • Set Deploy command to: (leave empty)
    • Set Build output directory to: public
    • Set Root directory to: (leave empty)
  4. Save changes and redeploy

For detailed instructions, see the Cloudflare Pages Deployment Guide.

Why this happens:

  • This is a Deno-based project, but it deploys to Cloudflare Workers, not Deno Deploy
  • The build environment has Node.js/pnpm but not Deno installed
  • Wrangler handles the deployment automatically

Cloudflare Worker deployment fails

Error:

Error: Worker exceeded memory limit

Solutions:

  1. Check bundle size:
du -h dist/worker.js
# Should be < 1MB
  1. Minify code:
deno bundle --minify src/worker.ts dist/worker.js
  1. Remove unused imports:
// BAD
import * as everything from '@jk-com/adblock-compiler';

// GOOD
import { compile, FilterCompiler } from '@jk-com/adblock-compiler';

Worker KV errors

Error:

KV namespace not found

Solution: Ensure KV namespace is bound in wrangler.toml:

[[kv_namespaces]]
binding = "CACHE"
id = "your-kv-namespace-id"

Create namespace:

wrangler kv:namespace create CACHE

Environment variables not set

Error:

ReferenceError: CACHE is not defined

Solution: Add bindings in wrangler.toml:

[env.production]
vars = { ENVIRONMENT = "production" }

[[env.production.kv_namespaces]]
binding = "CACHE"
id = "production-kv-id"

Platform-Specific Issues

Deno issues

Issue: Import map not working

Solution:

# Use deno.json, not import_map.json
# Ensure deno.json is in project root

Issue: Type errors

Solution:

# Clear Deno cache
rm -rf ~/.cache/deno
deno cache --reload src/main.ts

Node.js issues

Issue: ES modules not supported

Solution: Add to package.json:

{
    "type": "module"
}

Or use .mjs extension:

mv index.js index.mjs

Issue: CommonJS require() not working

Solution:

// Use dynamic import
const { compile } = await import('@jk-com/adblock-compiler');

// Or convert to ES modules

Browser issues

Issue: Module not found

Solution: Use a bundler (esbuild, webpack):

npm install -D esbuild
npx esbuild src/main.ts --bundle --outfile=dist/bundle.js

Issue: CORS with local files

Solution: Run a local server:

# Python
python -m http.server 8000

# Deno
deno run --allow-net --allow-read https://deno.land/std/http/file_server.ts

# Node
npx serve .

Getting Help

Enable debug logging

// Set environment variable
Deno.env.set('DEBUG', 'true');

// Or in .env file
DEBUG = true;

Collect diagnostics

# System info
deno --version
node --version

# Network test
curl -I https://adblock-compiler.jayson-knight.workers.dev/api

# Permissions test
deno run --allow-net test.ts

Report an issue

Include:

  1. Error message (full stack trace)
  2. Minimal reproduction code
  3. Configuration file (sanitized)
  4. Platform/version info
  5. Steps to reproduce

GitHub Issues: https://github.com/jaypatrick/adblock-compiler/issues

Community support

Quick Fixes Checklist

  • Updated to latest version?
  • Cleared cache? (rm -rf ~/.cache/deno or rm -rf node_modules)
  • Correct permissions? (--allow-net --allow-read)
  • Valid configuration? (name + sources required)
  • Network connectivity? (curl -I <source-url>)
  • Rate limits respected? (60 req/min)
  • Checked GitHub issues? (Someone may have solved it)

Still stuck? Open an issue with full details!

Validation Error Tracking

This document describes how validation errors are tracked and displayed through the agtree integration.

Overview

The compiler now tracks all validation errors encountered during the validation transformation. This provides detailed feedback about why specific rules were rejected, making it easier to debug filter lists and understand what's happening during compilation.

Features

  • Comprehensive Error Tracking: All validation errors are collected with detailed context
  • Error Types: Different error types (parse errors, syntax errors, unsupported modifiers, etc.)
  • Severity Levels: Errors, warnings, and info messages
  • Line Numbers: Track which line in the source caused the error
  • Source Attribution: Know which source file an error came from
  • UI Display: User-friendly error display with filtering and export capabilities

Error Types

The following validation error types are tracked:

Error TypeDescription
parse_errorRule failed to parse via AGTree
syntax_errorInvalid syntax detected
unsupported_modifierModifier not supported for DNS blocking
invalid_hostnameHostname format is invalid
ip_not_allowedIP addresses not permitted
pattern_too_shortPattern doesn't meet minimum length requirement
public_suffix_matchMatching entire public suffix (too broad)
invalid_charactersPattern contains invalid characters
cosmetic_not_supportedCosmetic rules not supported for DNS blocking
modifier_validation_failedAGTree modifier validation warning

Severity Levels

  • Error: Rule will be removed from the output
  • Warning: Rule may have issues but is kept
  • Info: Informational message

Usage in Code

TypeScript/JavaScript

import { ValidateTransformation } from './transformations/ValidateTransformation.ts';
import { ValidationReport } from './types/validation.ts';

// Create validator
const validator = new ValidateTransformation(false /* allowIp */);

// Optionally set source name for error tracking
validator.setSourceName('AdGuard DNS Filter');

// Execute validation
const validRules = validator.executeSync(rules);

// Get validation report
const report: ValidationReport = validator.getValidationReport(
    rules.length,
    validRules.length
);

// Check results
console.log(`Errors: ${report.errorCount}`);
console.log(`Warnings: ${report.warningCount}`);
console.log(`Valid: ${report.validRules}/${report.totalRules}`);

// Iterate through errors
for (const error of report.errors) {
    console.log(`[${error.severity}] ${error.message}`);
    console.log(`  Rule: ${error.ruleText}`);
    if (error.lineNumber) {
        console.log(`  Line: ${error.lineNumber}`);
    }
}

Web UI

To display validation reports in your web UI, include the validation UI component and manually integrate it:

<!-- Include validation UI script -->
<script src="validation-ui.js"></script>

<script>
  // Show validation report
  const report = {
    totalRules: 1000,
    validRules: 950,
    invalidRules: 50,
    errorCount: 45,
    warningCount: 5,
    infoCount: 0,
    errors: [
      {
        type: 'unsupported_modifier',
        severity: 'error',
        ruleText: '||example.com^$popup',
        message: 'Unsupported modifier: popup',
        details: 'Supported modifiers: important, ~important, ctag, dnstype, dnsrewrite',
        lineNumber: 42,
        sourceName: 'Custom Filter'
      }
    ]
  };

  ValidationUI.showReport(report);
</script>

Validation Report Structure

interface ValidationReport {
    /** Total number of errors */
    errorCount: number;
    /** Total number of warnings */
    warningCount: number;
    /** Total number of info messages */
    infoCount: number;
    /** List of all validation errors */
    errors: ValidationError[];
    /** Total rules validated */
    totalRules: number;
    /** Valid rules count */
    validRules: number;
    /** Invalid rules count (removed) */
    invalidRules: number;
}

interface ValidationError {
    /** Type of validation error */
    type: ValidationErrorType;
    /** Severity level */
    severity: ValidationSeverity;
    /** The rule text that failed validation */
    ruleText: string;
    /** Line number in the original source */
    lineNumber?: number;
    /** Human-readable error message */
    message: string;
    /** Additional context or details */
    details?: string;
    /** The parsed AST node (if available) */
    ast?: AnyRule;
    /** Source name */
    sourceName?: string;
}

UI Features

Summary Cards

The validation report shows summary cards with:

  • Total rules processed
  • Valid rules count
  • Invalid rules count
  • Error count
  • Warning count

Error List

  • Filtering: Filter by severity (All, Errors, Warnings)
  • Details: Each error shows:
    • Severity badge
    • Error type
    • Line number
    • Source name
    • Message
    • Details/explanation
    • The actual rule text
  • Color Coding: Errors, warnings, and info messages use different colors
  • Export: Download the full validation report as JSON

Dark Mode Support

The validation UI fully supports dark mode and will adapt to the current theme.

Color Coding

The validation UI uses comprehensive color coding for better visual understanding:

Error Type Colors

Each error type has a unique color scheme:

  • Parse/Syntax Errors - Red (#dc3545)
  • Unsupported Modifier - Orange (#fd7e14)
  • Invalid Hostname - Pink (#e83e8c)
  • IP Not Allowed - Purple (#6610f2)
  • Pattern Too Short - Yellow (#ffc107)
  • Public Suffix Match - Light Red (#ff6b6b)
  • Invalid Characters - Magenta (#d63384)
  • Cosmetic Not Supported - Cyan (#0dcaf0)

Rule Syntax Highlighting

Rules are syntax-highlighted based on their type:

  • Network rules: Domain in blue, modifiers in orange, separators in gray
  • Exception rules: @@ prefix in green
  • Host rules: IP address in purple, domain in blue
  • Cosmetic rules: Selector in green, separator in magenta
  • Comments: Gray and italic

Problematic parts are highlighted with a colored background matching the error type.

AST Node Colors

When viewing the parsed AST structure, nodes are color-coded by type:

  • Network Category - Blue (#0d6efd)
  • Network Rule - Light Blue (#0dcaf0)
  • Host Rule - Purple (#6610f2)
  • Cosmetic Rule - Pink (#d63384)
  • Modifier - Orange (#fd7e14)
  • Comment - Gray (#6c757d)
  • Invalid Rule - Red (#dc3545)

Value Type Colors

In the AST visualization, values are colored by type:

  • Boolean true - Green (#198754)
  • Boolean false - Red (#dc3545)
  • Numbers - Purple (#6610f2)
  • Strings - Blue (#0d6efd)

Integration with Compiler

The FilterCompiler and WorkerCompiler can be extended to return validation reports:

interface CompilationResult {
    rules: string[];
    validation?: ValidationReport;
    // ... other properties
}

Example Output

Console Output

[ERROR] Unsupported modifier: popup
  Rule: ||example.com^$popup
  Line: 42
  Source: Custom Filter

[ERROR] Pattern too short
  Rule: ||ad^
  Line: 156
  Details: Minimum pattern length is 5 characters

[WARNING] Modifier validation warning
  Rule: ||ads.com^$important,dnstype=A
  Details: Modifier combination may have unexpected behavior

JSON Export

{
  "errorCount": 2,
  "warningCount": 1,
  "infoCount": 0,
  "totalRules": 1000,
  "validRules": 997,
  "invalidRules": 3,
  "errors": [
    {
      "type": "unsupported_modifier",
      "severity": "error",
      "ruleText": "||example.com^$popup",
      "message": "Unsupported modifier: popup",
      "details": "Supported modifiers: important, ~important, ctag, dnstype, dnsrewrite",
      "lineNumber": 42,
      "sourceName": "Custom Filter"
    }
  ]
}

Best Practices

  1. Always check the validation report after compilation to understand what was filtered out
  2. Use source names when validating multiple sources to track which source has issues
  3. Export reports for debugging and sharing with filter list maintainers
  4. Filter by severity to focus on critical errors first
  5. Review warnings as they may indicate potential issues even if rules are kept

Future Enhancements

Potential improvements for validation error tracking:

  • Suggestions for fixing common errors
  • Rule rewriting suggestions
  • Batch validation of multiple filter lists
  • Historical tracking of validation issues
  • Integration with external filter list validators
  • Automatic issue reporting to filter list repositories

Configuration

Back to README

Configuration defines your filter list sources, and the transformations that are applied to the sources.

Here is an example of this configuration:

{
    "name": "List name",
    "description": "List description",
    "homepage": "https://example.org/",
    "license": "GPLv3",
    "version": "1.0.0.0",
    "sources": [
        {
            "name": "Local rules",
            "source": "rules.txt",
            "type": "adblock",
            "transformations": ["RemoveComments", "Compress"],
            "exclusions": ["excluded rule 1"],
            "exclusions_sources": ["exclusions.txt"],
            "inclusions": ["*"],
            "inclusions_sources": ["inclusions.txt"]
        },
        {
            "name": "Remote rules",
            "source": "https://example.org/rules",
            "type": "hosts",
            "exclusions": ["excluded rule 1"]
        }
    ],
    "transformations": ["Deduplicate", "Compress"],
    "exclusions": ["excluded rule 1", "excluded rule 2"],
    "exclusions_sources": ["global_exclusions.txt"],
    "inclusions": ["*"],
    "inclusions_sources": ["global_inclusions.txt"]
}
  • name - (mandatory) the list name.
  • description - (optional) the list description.
  • homepage - (optional) URL to the list homepage.
  • license - (optional) Filter list license.
  • version - (optional) Filter list version.
  • sources - (mandatory) array of the list sources.
    • .source - (mandatory) path or URL of the source. It can be a traditional filter list or a hosts file.
    • .name - (optional) name of the source.
    • .type - (optional) type of the source. It could be adblock for Adblock-style lists or hosts for /etc/hosts style lists. If not specified, adblock is assumed.
    • .transformations - (optional) a list of transformations to apply to the source rules. By default, no transformations are applied. Learn more about possible transformations here.
    • .exclusions - (optional) a list of rules (or wildcards) to exclude from the source.
    • .exclusions_sources - (optional) a list of files with exclusions.
    • .inclusions - (optional) a list of wildcards to include from the source. All rules that don't match these wildcards won't be included.
    • .inclusions_sources - (optional) a list of files with inclusions.
  • transformations - (optional) a list of transformations to apply to the final list of rules. By default, no transformations are applied. Learn more about possible transformations here.
  • exclusions - (optional) a list of rules (or wildcards) to exclude from the source.
  • exclusions_sources - (optional) a list of files with exclusions.
  • inclusions - (optional) a list of wildcards to include from the source. All rules that don't match these wildcards won't be included.
  • inclusions_sources - (optional) a list of files with inclusions.

Here is an example of a minimal configuration:

{
    "name": "test list",
    "sources": [
        {
            "source": "rules.txt"
        }
    ]
}

Exclusion and inclusion rules

Please note, that exclusion or inclusion rules may be a plain string, wildcard, or a regular expression.

  • plainstring - every rule that contains plainstring will match the rule
  • *.plainstring - every rule that matches this wildcard will match the rule
  • /regex/ - every rule that matches this regular expression, will match the rule. By default, regular expressions are case-insensitive.
  • ! comment - comments will be ignored.

[!IMPORTANT] Ensure that rules in the exclusion list match the format of the rules in the filter list. To maintain a consistent format, add the Compress transformation to convert /etc/hosts rules to adblock syntax. This is especially useful if you have multiple lists in different formats.

Here is an example:

Rules in HOSTS syntax: /hosts.txt

0.0.0.0 ads.example.com
0.0.0.0 tracking.example1.com
0.0.0.0 example.com

Exclusion rules in adblock syntax: /exclusions.txt

||example.com^

Configuration of the final list:

{
    "name": "List name",
    "description": "List description",
    "sources": [
        {
            "name": "HOSTS rules",
            "source": "hosts.txt",
            "type": "hosts",
            "transformations": ["Compress"]
        }
    ],
    "transformations": ["Deduplicate", "Compress"],
    "exclusions_sources": ["exclusions.txt"]
}

Final filter output of /hosts.txt after applying the Compress transformation and exclusions:

||ads.example.com^
||tracking.example1.com^

The last rule ||example.com^ will correctly match the rule from the exclusion list and will be excluded.

CLI Reference

Back to README

The adblock-compiler CLI is the primary entry-point for compiling filter lists locally with full control over the transformation pipeline, HTTP fetching, filtering, and output.

Installation

# Run directly with Deno (no install)
deno run --allow-read --allow-write --allow-net jsr:@jk-com/adblock-compiler/cli -c config.json -o output.txt

# Install globally
deno install --allow-read --allow-write --allow-net -n adblock-compiler jsr:@jk-com/adblock-compiler/cli

Usage

adblock-compiler [options]

Options

General

FlagShortTypeDescription
--config <file>-cstringPath to the compiler configuration file
--input <source>-istring[]URL or file path to compile (repeatable)
--input-type <type>-thosts|adblockInput format [default: hosts]
--verbose-vbooleanEnable verbose logging
--benchmark-bbooleanShow performance benchmark report
--use-queue-qbooleanSubmit job to async queue (requires worker API)
--priority <level>standard|highQueue priority [default: standard]
--versionbooleanShow version number
--help-hbooleanShow help

Either --config or --input must be provided (but not both).


Output

FlagShortTypeDescription
--output <file>-ostringOutput file path [required unless --stdout]
--stdoutbooleanWrite output to stdout instead of a file
--appendbooleanAppend to output file instead of overwriting
--format <format>stringOutput format
--name <file>stringCompare output against an existing file and print a summary of added/removed rules
--max-rules <n>numberTruncate output to at most n rules

--stdout and --output are mutually exclusive.


Transformation Control

When no transformation flags are specified, the default pipeline is used: RemoveCommentsDeduplicateCompressValidateTrimLinesInsertFinalNewLine

FlagTypeDescription
--no-commentsbooleanSkip the RemoveComments transformation
--no-deduplicatebooleanSkip the Deduplicate transformation
--no-compressbooleanSkip the Compress transformation
--no-validatebooleanSkip the Validate transformation
--allow-ipbooleanReplace Validate with ValidateAllowIp (keeps IP-address rules)
--invert-allowbooleanAppend the InvertAllow transformation
--remove-modifiersbooleanAppend the RemoveModifiers transformation
--convert-to-asciibooleanAppend the ConvertToAscii transformation
--transformation <name>string[]Override the entire pipeline (repeatable). When provided, all other transformation flags are ignored.

Available transformation names for --transformation:

NameDescription
RemoveCommentsRemove ! and # comment lines
DeduplicateRemove duplicate rules
CompressConvert hosts-format rules to adblock syntax and remove redundant entries
ValidateRemove dangerous or incompatible rules (strips IP-address rules)
ValidateAllowIpLike Validate but keeps IP-address rules
InvertAllowConvert blocking rules to allow/exception rules
RemoveModifiersStrip unsupported modifiers ($third-party, $document, etc.)
TrimLinesRemove leading/trailing whitespace from each line
InsertFinalNewLineEnsure the output ends with a newline
RemoveEmptyLinesRemove blank lines
ConvertToAsciiConvert non-ASCII hostnames to Punycode

See TRANSFORMATIONS.md for detailed descriptions of each transformation.


Filtering

These flags apply globally to the compiled output (equivalent to IConfiguration.exclusions / inclusions).

FlagTypeDescription
--exclude <pattern>string[]Exclude rules matching the pattern (repeatable). Supports exact strings, * wildcards, and /regex/ patterns. Maps to exclusions[].
--exclude-from <file>string[]Load exclusion patterns from a file (repeatable). Maps to exclusions_sources[].
--include <pattern>string[]Include only rules matching the pattern (repeatable). Maps to inclusions[].
--include-from <file>string[]Load inclusion patterns from a file (repeatable). Maps to inclusions_sources[].

When used with --config, these flags are overlaid on top of any exclusions / inclusions already defined in the config file.


Networking

FlagTypeDescription
--timeout <ms>numberHTTP request timeout in milliseconds
--retries <n>numberNumber of HTTP retry attempts (uses exponential backoff)
--user-agent <string>stringCustom User-Agent header for HTTP requests

Examples

Basic compilation from a config file

adblock-compiler -c config.json -o output.txt

Compile from multiple URL sources

adblock-compiler \
  -i https://example.org/hosts.txt \
  -i https://example.org/extra.txt \
  -o output.txt

Stream output to stdout

adblock-compiler -i https://example.org/hosts.txt --stdout

Skip specific transformations

# Keep IP-address rules and skip compression
adblock-compiler -c config.json -o output.txt --allow-ip --no-compress

# Skip deduplication (faster, output may contain duplicates)
adblock-compiler -c config.json -o output.txt --no-deduplicate

Explicit transformation pipeline

# Only remove comments and deduplicate — no compression or validation
adblock-compiler -i https://example.org/hosts.txt -o output.txt \
  --transformation RemoveComments \
  --transformation Deduplicate \
  --transformation TrimLines \
  --transformation InsertFinalNewLine

Filtering rules from output

# Exclude specific domain patterns
adblock-compiler -c config.json -o output.txt \
  --exclude "*.cdn.example.com" \
  --exclude "ads.example.org"

# Load exclusion list from a file
adblock-compiler -c config.json -o output.txt \
  --exclude-from my-whitelist.txt

# Include only rules matching a pattern
adblock-compiler -c config.json -o output.txt \
  --include "*.example.com"

# Load inclusion list from a file
adblock-compiler -c config.json -o output.txt \
  --include-from my-allowlist.txt

Limit output size

# Truncate to first 50,000 rules
adblock-compiler -c config.json -o output.txt --max-rules 50000

Compare output against a previous build

adblock-compiler -c config.json -o output.txt --name output.txt.bak
# Output:
# Comparison with output.txt.bak:
#   Added:   +42 rules
#   Removed: -7 rules
#   Net:     +35 rules

Append to an existing output file

adblock-compiler -i extra.txt -o output.txt --append

Custom networking options

adblock-compiler -c config.json -o output.txt \
  --timeout 15000 \
  --retries 5 \
  --user-agent "MyListBot/1.0"

Verbose benchmarking

adblock-compiler -c config.json -o output.txt --verbose --benchmark

Configuration File

When using --config, the compiler reads an IConfiguration JSON file. The CLI filtering and transformation flags are applied as an overlay on top of what is defined in that file.

See CONFIGURATION.md for the full configuration file reference.

Configuration

Back to README

Configuration defines your filter list sources, and the transformations that are applied to the sources.

Here is an example of this configuration:

{
    "name": "List name",
    "description": "List description",
    "homepage": "https://example.org/",
    "license": "GPLv3",
    "version": "1.0.0.0",
    "sources": [
        {
            "name": "Local rules",
            "source": "rules.txt",
            "type": "adblock",
            "transformations": ["RemoveComments", "Compress"],
            "exclusions": ["excluded rule 1"],
            "exclusions_sources": ["exclusions.txt"],
            "inclusions": ["*"],
            "inclusions_sources": ["inclusions.txt"]
        },
        {
            "name": "Remote rules",
            "source": "https://example.org/rules",
            "type": "hosts",
            "exclusions": ["excluded rule 1"]
        }
    ],
    "transformations": ["Deduplicate", "Compress"],
    "exclusions": ["excluded rule 1", "excluded rule 2"],
    "exclusions_sources": ["global_exclusions.txt"],
    "inclusions": ["*"],
    "inclusions_sources": ["global_inclusions.txt"]
}
  • name - (mandatory) the list name.
  • description - (optional) the list description.
  • homepage - (optional) URL to the list homepage.
  • license - (optional) Filter list license.
  • version - (optional) Filter list version.
  • sources - (mandatory) array of the list sources.
    • .source - (mandatory) path or URL of the source. It can be a traditional filter list or a hosts file.
    • .name - (optional) name of the source.
    • .type - (optional) type of the source. It could be adblock for Adblock-style lists or hosts for /etc/hosts style lists. If not specified, adblock is assumed.
    • .transformations - (optional) a list of transformations to apply to the source rules. By default, no transformations are applied. Learn more about possible transformations here.
    • .exclusions - (optional) a list of rules (or wildcards) to exclude from the source.
    • .exclusions_sources - (optional) a list of files with exclusions.
    • .inclusions - (optional) a list of wildcards to include from the source. All rules that don't match these wildcards won't be included.
    • .inclusions_sources - (optional) a list of files with inclusions.
  • transformations - (optional) a list of transformations to apply to the final list of rules. By default, no transformations are applied. Learn more about possible transformations here.
  • exclusions - (optional) a list of rules (or wildcards) to exclude from the source.
  • exclusions_sources - (optional) a list of files with exclusions.
  • inclusions - (optional) a list of wildcards to include from the source. All rules that don't match these wildcards won't be included.
  • inclusions_sources - (optional) a list of files with inclusions.

Here is an example of a minimal configuration:

{
    "name": "test list",
    "sources": [
        {
            "source": "rules.txt"
        }
    ]
}

Exclusion and inclusion rules

Please note, that exclusion or inclusion rules may be a plain string, wildcard, or a regular expression.

  • plainstring - every rule that contains plainstring will match the rule
  • *.plainstring - every rule that matches this wildcard will match the rule
  • /regex/ - every rule that matches this regular expression, will match the rule. By default, regular expressions are case-insensitive.
  • ! comment - comments will be ignored.

[!IMPORTANT] Ensure that rules in the exclusion list match the format of the rules in the filter list. To maintain a consistent format, add the Compress transformation to convert /etc/hosts rules to adblock syntax. This is especially useful if you have multiple lists in different formats.

Here is an example:

Rules in HOSTS syntax: /hosts.txt

0.0.0.0 ads.example.com
0.0.0.0 tracking.example1.com
0.0.0.0 example.com

Exclusion rules in adblock syntax: /exclusions.txt

||example.com^

Configuration of the final list:

{
    "name": "List name",
    "description": "List description",
    "sources": [
        {
            "name": "HOSTS rules",
            "source": "hosts.txt",
            "type": "hosts",
            "transformations": ["Compress"]
        }
    ],
    "transformations": ["Deduplicate", "Compress"],
    "exclusions_sources": ["exclusions.txt"]
}

Final filter output of /hosts.txt after applying the Compress transformation and exclusions:

||ads.example.com^
||tracking.example1.com^

The last rule ||example.com^ will correctly match the rule from the exclusion list and will be excluded.

Transformations

Back to README

Here is the full list of transformations that are available:

  1. ConvertToAscii
  2. TrimLines
  3. RemoveComments
  4. Compress
  5. RemoveModifiers
  6. InvertAllow
  7. Validate
  8. ValidateAllowIp
  9. Deduplicate
  10. RemoveEmptyLines
  11. InsertFinalNewLine

Please note that these transformations are always applied in the order specified here.

RemoveComments

This is a very simple transformation that simply removes comments (e.g. all rules starting with ! or #).

Compress

[!IMPORTANT] This transformation converts hosts lists into adblock lists.

Here's what it does:

  1. It converts all rules to adblock-style rules. For instance, 0.0.0.0 example.org will be converted to ||example.org^.
  2. It discards the rules that are now redundant because of other existing rules. For instance, ||example.org blocks example.org and all it's subdomains, therefore additional rules for the subdomains are now redundant.

RemoveModifiers

By default, AdGuard Home will ignore rules with unsupported modifiers, and all of the modifiers listed here are unsupported. However, the rules with these modifiers are likely to be okay for DNS-level blocking, that's why you might want to remove them when importing rules from a traditional filter list.

Here is the list of modifiers that will be removed:

  • $third-party and $3p modifiers
  • $document and $doc modifiers
  • $all modifier
  • $popup modifier
  • $network modifier

[!CAUTION] Blindly removing $third-party from traditional ad blocking rules leads to lots of false-positives.

This is exactly why there is an option to exclude rules - you may need to use it.

Validate

This transformation is really crucial if you're using a filter list for a traditional ad blocker as a source.

It removes dangerous or incompatible rules from the list.

So here's what it does:

  • Discards domain-specific rules (e.g. ||example.org^$domain=example.com). You don't want to have domain-specific rules working globally.
  • Discards rules with unsupported modifiers. Click here to learn more about which modifiers are supported.
  • Discards rules that are too short.
  • Discards IP addresses. If you need to keep IP addresses, use ValidateAllowIp instead.
  • Removes rules that block entire top-level domains (TLDs) like ||*.org^, unless they have specific limiting modifiers such as $denyallow, $badfilter, or $client. Examples:
    • ||*.org^ - this rule will be removed
    • ||*.org^$denyallow=example.com - this rule will be kept because it has a limiting modifier

If there are comments preceding the invalid rule, they will be removed as well.

ValidateAllowIp

This transformation exactly repeats the behavior of Validate, but leaves the IP addresses in the lists.

Deduplicate

This transformation simply removes the duplicates from the specified source.

There are two important notes about this transformation:

  1. It keeps the original rules order.
  2. It ignores comments. However, if the comments precede the rule that is being removed, the comments will be also removed.

For instance:

! rule1 comment 1
rule1
! rule1 comment 2
rule1

Here's what will be left after the transformation:

! rule1 comment 2
rule1

InvertAllow

This transformation converts blocking rules to "allow" rules. Note, that it does nothing to /etc/hosts rules (unless they were previously converted to adblock-style syntax by a different transformation, for example Compress).

There are two important notes about this transformation:

  1. It keeps the original rules order.
  2. It ignores comments, empty lines, /etc/hosts rules and existing "allow" rules.

Example:

Original list:

! comment 1
rule1

# comment 2
192.168.11.11   test.local
@@rule2

Here's what we will have after applying this transformation:

! comment 1
@@rule1

# comment 2
192.168.11.11   test.local
@@rule2

RemoveEmptyLines

This is a very simple transformation that removes empty lines.

Example:

Original list:

rule1

rule2


rule3

Here's what we will have after applying this transformation:

rule1
rule2
rule3

TrimLines

This is a very simple transformation that removes leading and trailing spaces/tabs.

Example:

Original list:

rule1
   rule2
rule3
		rule4

Here's what we will have after applying this transformation:

rule1
rule2
rule3
rule4

InsertFinalNewLine

This is a very simple transformation that inserts a final newline.

Example:

Original list:

rule1
rule2
rule3

Here's what we will have after applying this transformation:

rule1
rule2
rule3

RemoveEmptyLines doesn't delete this empty row due to the execution order.

ConvertToAscii

This transformation converts all non-ASCII characters to their ASCII equivalents. It is always performed first.

Example:

Original list:

||*.рус^
||*.कॉम^
||*.セール^

Here's what we will have after applying this transformation:

||*.xn--p1acf^
||*.xn--11b4c3d^
||*.xn--1qqw23a^

Postman Collection

Postman collection and environment files for testing the Adblock Compiler API.

Auto-generated — do not edit these files directly. Run deno task postman:collection to regenerate from docs/api/openapi.yaml.

Files

  • postman-collection.json - Postman collection with all API endpoints and tests (auto-generated)
  • postman-environment.json - Postman environment with local and production variables (auto-generated)

Regenerating

Both files are generated automatically from the canonical OpenAPI spec:

deno task postman:collection

The CI pipeline (validate-postman-collection job) enforces that these files stay in sync with docs/api/openapi.yaml. If you modify the spec, run the task above and commit the updated files — CI will fail otherwise.

Schema hierarchy

docs/api/openapi.yaml                 ← canonical source of truth (edit this)
docs/api/cloudflare-schema.yaml       ← auto-generated (deno task schema:cloudflare)
docs/postman/postman-collection.json  ← auto-generated (deno task postman:collection)
docs/postman/postman-environment.json ← auto-generated (deno task postman:collection)

Quick Start

  1. Open Postman and click Import
  2. Import postman-collection.json to add all API requests
  3. Import postman-environment.json to configure environments
  4. Select the Adblock Compiler API - Local environment
  5. Start the server: deno task dev
  6. Run requests individually or as a collection

Reference Documentation

Reference material, configuration guides, and project information.

Contents

Automatic Version Bumping

This document explains how automatic version bumping works in the adblock-compiler project using Conventional Commits.

Overview

The project uses Conventional Commits to automatically determine version bumps following Semantic Versioning (SemVer).

How It Works

Automatic Trigger

The version-bump.yml workflow automatically runs when:

  • Code is pushed to main or master branch
  • A PR is merged to the main branch

It can also be triggered manually with a specific version bump type.

Version Bump Rules

Version bumps are determined by analyzing commit messages:

Commit TypeVersion BumpExampleOld → New
feat:Minor (0.x.0)feat: add new transformation0.12.0 → 0.13.0
fix:Patch (0.0.x)fix: resolve parsing error0.12.0 → 0.12.1
perf:Patch (0.0.x)perf: optimize rule matching0.12.0 → 0.12.1
feat!: or BREAKING CHANGE:Major (x.0.0)feat!: change API interface0.12.0 → 1.0.0
chore:, docs:, style:, refactor:, test:, ci:Nonedocs: update READMENo bump

Conventional Commit Format

<type>[optional scope]: <description>

[optional body]

[optional footer(s)]

Examples:

# Minor version bump (new feature)
feat: add WebSocket support for real-time compilation

# Patch version bump (bug fix)
fix: correct version synchronization in worker

# Patch version bump (performance)
perf: improve rule deduplication speed

# Major version bump (breaking change)
feat!: change compiler API to async-only

# Alternative breaking change syntax
feat: migrate to new configuration format

BREAKING CHANGE: Configuration now requires 'version' field

Workflow Behavior

1. Commit Analysis

The workflow analyzes all commits since the last version bump:

# Gets commits since last "chore: bump version" commit
git log --grep="chore: bump version" -n 1
git log <last-version>..HEAD

2. Version Bump Decision

  • Scans commit messages for conventional commit types
  • Determines the highest priority bump needed:
    • Major takes precedence over minor and patch
    • Minor takes precedence over patch
    • Patch is the lowest priority

3. File Updates

If a version bump is needed, the workflow updates:

  1. deno.json - Package version
  2. package.json - NPM package version
  3. src/version.ts - VERSION constant
  4. wrangler.toml - COMPILER_VERSION variable
  5. CHANGELOG.md - Auto-generated changelog entry

4. Changelog Generation

The workflow automatically generates a changelog entry with:

  • Added section - Features from feat: commits
  • Fixed section - Bug fixes from fix: commits
  • Performance section - Improvements from perf: commits
  • BREAKING CHANGES section - Breaking changes from commit footers

5. Pull Request Creation

The workflow:

  1. Creates a new branch: auto-version-bump-X.Y.Z
  2. Commits changes with message: chore: bump version to X.Y.Z
  3. Pushes the branch to the repository
  4. Creates a pull request with the version bump changes

6. Tag Creation and Release

After the version bump PR is merged:

  1. The create-version-tag.yml workflow is triggered
  2. It creates a git tag: vX.Y.Z
  3. The tag automatically triggers the release.yml workflow which:
    • Builds binaries for all platforms
    • Publishes to JSR (JavaScript Registry)
    • Creates a GitHub Release

Skipping Version Bumps

To skip automatic version bumping, include one of these in your commit message:

git commit -m "docs: update README [skip ci]"
git commit -m "chore: update dependencies [skip version]"

Manual Version Bump

If you need to manually bump the version:

Option 1: Use the Workflow Dispatch

You can manually trigger the version bump workflow:

# Go to Actions → Version Bump → Run workflow
# Select bump type: patch, minor, or major (or leave empty for auto-detect)
# Optionally check "Create a release after bumping"

Best Practices

Writing Good Commit Messages

Good Examples:

feat: add batch compilation endpoint
feat(worker): implement queue-based processing
fix: resolve memory leak in rule parser
fix(validation): handle edge case for IPv6 addresses
perf: optimize deduplication algorithm
docs: add API documentation for streaming
chore: update dependencies

Bad Examples:

added feature              # Missing type prefix
Fix bug                    # Incorrect capitalization
feat add new feature       # Missing colon
update code                # Too vague, missing type

Commit Message Structure

  1. Type: Use appropriate type (feat, fix, perf, etc.)
  2. Scope (optional): Component affected (worker, compiler, api)
  3. Description: Clear, concise description in imperative mood
  4. Body (optional): Detailed explanation of changes
  5. Footer (optional): Breaking changes, issue references

Breaking Changes

When introducing breaking changes:

# Option 1: Use ! after type
feat!: change API to async-only

# Option 2: Use footer
feat: migrate to new config format

BREAKING CHANGE: Configuration schema has changed.
Old format is no longer supported. See migration guide.

Troubleshooting

No Version Bump Occurred

Cause: No commits with feat:, fix:, or perf: since last bump

Solution:

  • Check commit messages follow conventional format
  • Ensure commits are pushed to main branch
  • Verify workflow wasn't skipped with [skip ci] or [skip version]

Wrong Version Bump Type

Cause: Incorrect commit message format

Solution:

  • Review commit messages since last bump
  • Use manual workflow to override if needed
  • Update commit messages and force-push (if not yet released)

Workflow Failed

Cause: Various (permissions, conflicts, etc.)

Solution:

  1. Check workflow logs in GitHub Actions
  2. Ensure GITHUB_TOKEN has write permissions
  3. Verify no conflicts in version files
  4. Check that all version files exist

Multiple Bumps in One Push

Cause: Multiple commits requiring different bump types

Solution:

  • The workflow automatically selects the highest priority bump
  • Major > Minor > Patch
  • Only one version bump per workflow run

Integration with Other Workflows

Version Bump Flow

Version Bump (auto or manual) → Creates PR → PR Merged → Create Version Tag → Triggers Release Workflow

The complete flow:

  1. Version Bump: Analyzes commits (or uses manual input) and creates a PR with version changes
  2. PR Review: Human or automated review/merge of the PR
  3. Create Version Tag: Automatically creates tag after PR merge
  4. Release Workflow: Builds, publishes, and creates GitHub release

CI Workflow

The CI workflow runs on:

  • Pull requests (before merge)
  • Pushes to any branch

Version bump workflow runs:

  • Automatically on pushes to main/master (analyzes commits)
  • Manually via workflow dispatch (specify bump type)
  • After PR is merged to main/master

Configuration

Workflow File

Location: .github/workflows/version-bump.yml

This consolidated workflow handles both automatic (conventional commits) and manual version bumping.

Customization

To customize behavior, edit the workflow file:

# Change branches that trigger auto-bump
on:
    push:
        branches:
            - main
            - production  # Add custom branches

# Modify skip conditions
if: |
    !contains(github.event.head_commit.message, '[skip ci]') &&
    !contains(github.event.head_commit.message, '[no bump]')  # Custom skip tag

Commit Type Recognition

To add custom commit types:

# In the "Determine version bump type" step
# Add pattern matching for custom types

# Example: Add 'security' type for patch bumps
if echo "$commit" | grep -qiE "^security(\(.+\))?:"; then
  if [ "$BUMP_TYPE" != "major" ] && [ "$BUMP_TYPE" != "minor" ]; then
    BUMP_TYPE="patch"
  fi
fi

Examples

Example 1: Feature Addition

# Commit
git commit -m "feat: add WebSocket support for real-time compilation"
git push origin main

# Result
# A PR is created: "chore: bump version to 0.13.0"
# After PR is merged:
#   - Version: 0.12.0 → 0.13.0
#   - Changelog: Added "WebSocket support for real-time compilation"
#   - Tag: v0.13.0
#   - Release: Triggered automatically

Example 2: Bug Fix

# Commit
git commit -m "fix: resolve race condition in queue processing"
git push origin main

# Result
# A PR is created: "chore: bump version to 0.13.1"
# After PR is merged:
#   - Version: 0.13.0 → 0.13.1
#   - Changelog: Fixed "race condition in queue processing"
#   - Tag: v0.13.1
#   - Release: Triggered automatically

Example 3: Breaking Change

# Commit
git commit -m "feat!: migrate to async-only API

BREAKING CHANGE: All compilation methods are now async.
Sync methods have been removed. Update your code to use await."
git push origin main

# Result
# A PR is created: "chore: bump version to 1.0.0"
# After PR is merged:
#   - Version: 0.13.1 → 1.0.0
#   - Changelog: Breaking change documented with migration guide
#   - Tag: v1.0.0
#   - Release: Triggered automatically

Example 4: No Version Bump

# Commit
git commit -m "docs: update API documentation"
git push origin main

# Result
# No version bump (docs don't require new version)
# No tag created
# No release triggered

Migration from Manual Bumps

If you're used to manual version bumping:

  1. Stop manually editing version files - Let the workflow handle it
  2. Use conventional commits - Follow the format guidelines
  3. Review auto-generated changelog - Ensure quality commit messages
  4. Use manual workflow for edge cases - When automation isn't suitable
  • VERSION_MANAGEMENT.md - Version synchronization details
  • Conventional Commits - Official specification
  • Semantic Versioning - SemVer specification
  • .github/workflows/version-bump.yml - Consolidated version bump workflow (automatic and manual)
  • .github/workflows/create-version-tag.yml - Tag creation after PR merge
  • .github/workflows/release.yml - Release workflow

Bugs and Feature Requests

This document tracks identified bugs and feature requests for the adblock-compiler project.

Last Updated: 2026-02-11


🐛 Bugs

Critical

BUG-002: No request body size limits

Impact: Potential DoS via large payloads Location: worker/handlers/compile.ts, worker/middleware/index.ts Fix: Add max body size validation (1MB default)

BUG-010: No CSRF protection

Impact: Vulnerability to CSRF attacks Location: Worker POST endpoints Fix: Add CSRF token validation

BUG-012: No SSRF protection for source URLs

Impact: Internal network access via malicious source URLs Location: src/downloader/FilterDownloader.ts Fix: Validate URLs to block private IPs and non-HTTP protocols

High

BUG-001: Direct console.log/console.error usage bypasses logger

Impact: Inconsistent logging Locations:

  • src/diagnostics/DiagnosticsCollector.ts:90-92, 128-130
  • src/utils/EventEmitter.ts
  • src/queue/CloudflareQueueProvider.ts
  • src/services/AnalyticsService.ts Fix: Replace all console.* calls with logger methods

BUG-003: Weak type validation in compile handler

Impact: Invalid data could pass through Location: worker/handlers/compile.ts:85-95 Fix: Use runtime validation before type assertion

BUG-006: Diagnostics events stored only in memory

Impact: Events not exported for analysis Location: src/diagnostics/DiagnosticsCollector.ts Fix: Add event export mechanism

BUG-011: Missing security headers

Impact: Reduced security posture Location: Worker responses Fix: Add X-Content-Type-Options, X-Frame-Options, CSP, HSTS

Medium

BUG-004: Silent error swallowing in FilterService

Impact: Failed downloads return empty strings Location: src/services/FilterService.ts:44 Fix: Let errors propagate or return Result type

BUG-007: No distributed trace ID propagation

Impact: Difficult to correlate logs across async operations Location: Worker handlers Fix: Extract and propagate trace IDs from headers

Low

BUG-005: Database errors not wrapped with custom types

Impact: Inconsistent error handling Location: src/storage/PrismaAdapter.ts, src/storage/D1Adapter.ts Fix: Wrap with StorageError

BUG-008: No public coverage reports

Impact: Unknown test coverage Fix: Add Codecov integration

BUG-009: E2E tests require running server

Impact: Manual test setup required Location: worker/api.e2e.test.ts, worker/websocket.e2e.test.ts Fix: Add test server lifecycle management


🚀 Feature Requests

Critical

FEATURE-001: Add structured JSON logging

Why: Production log aggregation requires structured format Implementation: Add StructuredLogger class with JSON output

FEATURE-004: Add Zod schema validation

Why: Type-safe runtime validation Implementation: Replace manual validation with Zod schemas

FEATURE-006: Centralized error reporting service

Why: Production error tracking (Sentry, Datadog) Implementation: ErrorReporter interface with Sentry/console implementations

FEATURE-008: Add circuit breaker pattern

Why: Prevent cascading failures Implementation: CircuitBreaker class for source downloads

FEATURE-009: Add OpenTelemetry integration

Why: Industry-standard distributed tracing Implementation: OpenTelemetry spans for compilation operations

FEATURE-014: Add rate limiting per endpoint

Why: Different endpoints have different resource costs Implementation: Per-endpoint rate limit configuration

FEATURE-016: Add health check endpoint enhancements

Why: Monitor dependencies, not just uptime Implementation: Health checks for database, cache, sources

FEATURE-021: Add runbook for common operations

Why: Operators need incident procedures Implementation: Create docs/RUNBOOK.md

High

FEATURE-005: Add URL allowlist/blocklist

Why: Prevent SSRF attacks Implementation: Domain-based URL filtering

FEATURE-017: Add metrics export endpoint

Why: Prometheus/Datadog integration Implementation: /metrics endpoint with standard format

Medium

FEATURE-002: Per-module log level configuration

Why: Verbose logging for specific modules Implementation: Module-level log level overrides

FEATURE-007: Add error code documentation

Why: Developers need to understand error codes Implementation: Create docs/ERROR_CODES.md

FEATURE-010: Add performance sampling

Why: Reduce tracing overhead at high volume Implementation: Configurable sampling rate for diagnostics

FEATURE-011: Add request duration histogram

Why: Understand performance distribution Implementation: Record durations in buckets (p50, p95, p99)

FEATURE-013: Add performance benchmarks

Why: Track performance regressions Implementation: Benchmarks for compilation, transformations, cache

FEATURE-015: Add request signing for admin endpoints

Why: Prevent replay attacks Implementation: HMAC-based request signing

FEATURE-019: Add configuration validation on startup

Why: Fail fast with missing environment variables Implementation: Validate required config on startup

FEATURE-020: Add graceful shutdown

Why: Allow in-flight requests to complete Implementation: SIGTERM handler with timeout

FEATURE-022: Add API documentation

Why: External users need API reference Implementation: Generate HTML docs from OpenAPI spec

Low

FEATURE-003: Log file output with rotation

Why: CLI could benefit from file logging Implementation: Optional file appender with size-based rotation

FEATURE-012: Add mutation testing

Why: Verify test effectiveness Implementation: Use Stryker or similar tool

FEATURE-018: Add dashboard for diagnostics

Why: Real-time system visibility Implementation: Web UI for active compilations, errors, cache stats


Quick Reference

By Category

Logging: BUG-001, FEATURE-001, FEATURE-002, FEATURE-003

Validation: BUG-002, BUG-003, FEATURE-004, FEATURE-005, FEATURE-019

Error Handling: BUG-004, BUG-005, FEATURE-006, FEATURE-007, FEATURE-008

Tracing/Diagnostics: BUG-006, BUG-007, FEATURE-009, FEATURE-010, FEATURE-011, FEATURE-018

Security: BUG-010, BUG-011, BUG-012, FEATURE-014, FEATURE-015

Observability: FEATURE-016, FEATURE-017, FEATURE-021

Testing: BUG-008, BUG-009, FEATURE-012, FEATURE-013

Operations: FEATURE-020, FEATURE-022

By Priority

Critical: BUG-002, BUG-010, BUG-012, FEATURE-001, FEATURE-004, FEATURE-006, FEATURE-008, FEATURE-009, FEATURE-014, FEATURE-016, FEATURE-021

High: BUG-001, BUG-003, BUG-006, BUG-011, FEATURE-005, FEATURE-017

Medium: BUG-004, BUG-007, FEATURE-002, FEATURE-007, FEATURE-010, FEATURE-011, FEATURE-013, FEATURE-015, FEATURE-019, FEATURE-020, FEATURE-022

Low: BUG-005, BUG-008, BUG-009, FEATURE-003, FEATURE-012, FEATURE-018


Notes

  • See PRODUCTION_READINESS.md for detailed analysis and implementation guidance
  • All bugs and features include specific file locations and implementation recommendations
  • Priority ratings based on production readiness requirements
  • Estimated total effort: 8-12 weeks for all items

Environment Configuration

This project uses a layered environment configuration system powered by .envrc and direnv.

How It Works

Environment variables are loaded in the following order (later files override earlier ones):

  1. .env - Base configuration shared across all environments (committed to git)
  2. .env.$ENV - Environment-specific configuration (committed to git)
  3. .env.local - Local overrides and secrets (NOT committed to git)

The $ENV variable is automatically determined by your current git branch:

Git BranchEnvironmentLoaded File
mainproduction.env.production
dev or developdevelopment.env.development
Other brancheslocal.env.local
Custom branch with fileCustom.env.$BRANCH_NAME

File Structure

.env                  # Base config (PORT, COMPILER_VERSION, etc.)
.env.development      # Development-specific (test API keys, local DB)
.env.production       # Production-specific (placeholder values)
.env.local            # Your personal secrets (NEVER commit this!)
.env.example          # Template showing all available variables

Setup Instructions

1. Enable direnv (if not already installed)

# macOS
brew install direnv

# Add to your shell config (~/.zshrc)
eval "$(direnv hook zsh)"

2. Allow the .envrc file

direnv allow

You should see: ✅ Loaded environment: development (branch: dev)

3. Create your .env.local file

cp .env.example .env.local

Then edit .env.local with your actual secrets and API keys.

What Goes Where?

.env (Committed)

  • Non-sensitive defaults
  • Port numbers
  • Version numbers
  • Public configuration

.env.development / .env.production (Committed)

  • Environment-specific defaults
  • Test API keys (development only)
  • Environment-specific feature flags
  • Non-secret configuration

.env.local (NOT Committed)

  • ALL secrets and API keys
  • Database connection strings
  • Authentication tokens
  • Personal overrides

Wrangler Integration

The wrangler.toml configuration supports environment-based deployments. Production is the default (top-level) environment; there is no --env production flag:

# Development deployment (uses [env.development] overrides in wrangler.toml)
wrangler deploy --env development

# Production deployment (uses top-level wrangler.toml config — no --env flag needed)
wrangler deploy

Environment variables from .env.local are automatically available during local development (wrangler dev).

For production deployments, secrets should be set using:

wrangler secret put ADMIN_KEY
wrangler secret put TURNSTILE_SECRET_KEY

Troubleshooting

Environment not loading?

# Re-allow the .envrc
direnv allow

# Check what's loaded
direnv exec . env | grep DATABASE_URL

Wrong environment?

Check your git branch:

git branch --show-current

The .envrc automatically maps your branch to an environment.

Variables not available?

Make sure:

  1. You've created .env.local from .env.example
  2. You've run direnv allow
  3. The variable exists in one of the .env files

Security Best Practices

  • DO commit .env, .env.development, .env.production
  • DO use test/dummy values in committed files
  • DO put all secrets in .env.local
  • DON'T commit .env.local
  • ⚠️ BE CAREFUL with .envrc — it is committed as part of the env-loading system, so never put secrets or credentials in it
  • DON'T put real secrets in any committed file
  • DON'T commit production credentials

GitHub Actions Integration

This environment system works seamlessly in GitHub Actions workflows. See ENV_SETUP.md for detailed documentation.

Quick Start

steps:
    - uses: actions/checkout@v4

    - name: Load environment variables
      uses: ./.github/actions/setup-env

    - name: Use environment variables
      run: echo "Version: $COMPILER_VERSION"

The action automatically:

  • Detects environment from branch name
  • Loads .env and .env.$ENV files
  • Exports variables to workflow

Environment Variables Reference

See .env.example for a complete list of available variables and their purposes.

GitHub Issue Templates

This document provides ready-to-use GitHub issue templates for the bugs and features identified in the production readiness assessment.


Critical Bugs

BUG-002: Add request body size limits

Title: Add request body size limits to prevent DoS attacks

Labels: bug, security, priority:critical

Description: Currently, the worker endpoints do not enforce request body size limits, which could allow DoS attacks via large payloads.

Impact:

  • Memory exhaustion
  • Worker crashes
  • Service unavailability

Affected Files:

  • worker/handlers/compile.ts
  • worker/middleware/index.ts

Proposed Solution:

async function validateRequestSize(
    request: Request,
    maxBytes: number = 1024 * 1024,
): Promise<void> {
    const contentLength = request.headers.get('content-length');
    if (contentLength && parseInt(contentLength) > maxBytes) {
        throw new Error(`Request body exceeds ${maxBytes} bytes`);
    }
    // Also enforce during body read for requests without Content-Length
}

Acceptance Criteria:

  • Request body size limited to 1MB by default
  • Configurable via environment variable
  • Returns 413 Payload Too Large when exceeded
  • Tests added for size limit validation

BUG-010: Add CSRF protection

Title: Add CSRF protection to state-changing endpoints

Labels: bug, security, priority:critical

Description: Worker endpoints accept POST requests without CSRF token validation, making them vulnerable to CSRF attacks.

Impact:

  • Unauthorized actions via cross-site requests
  • Security vulnerability

Affected Files:

  • worker/handlers/compile.ts
  • worker/middleware/index.ts

Proposed Solution:

function validateCsrfToken(request: Request): boolean {
    const token = request.headers.get('X-CSRF-Token');
    const cookie = getCookie(request, 'csrf-token');
    return Boolean(token && cookie && token === cookie);
}

Acceptance Criteria:

  • CSRF token validation middleware created
  • Applied to all POST/PUT/DELETE endpoints
  • Token generation endpoint added
  • Tests added for CSRF validation
  • Documentation updated

BUG-012: Add SSRF protection for source URLs

Title: Prevent SSRF attacks via malicious source URLs

Labels: bug, security, priority:critical

Description: The FilterDownloader fetches arbitrary URLs without validation, allowing potential SSRF attacks to access internal networks.

Impact:

  • Access to internal network resources
  • Potential data exposure
  • Security vulnerability

Affected Files:

  • src/downloader/FilterDownloader.ts
  • src/platform/HttpFetcher.ts

Proposed Solution:

function isSafeUrl(url: string): boolean {
    const parsed = new URL(url);

    // Block private IPs
    if (
        parsed.hostname === 'localhost' ||
        parsed.hostname.startsWith('127.') ||
        parsed.hostname.startsWith('192.168.') ||
        parsed.hostname.startsWith('10.') ||
        /^172\.(1[6-9]|2[0-9]|3[0-1])\./.test(parsed.hostname)
    ) {
        return false;
    }

    // Only allow http/https
    if (!['http:', 'https:'].includes(parsed.protocol)) {
        return false;
    }

    return true;
}

Acceptance Criteria:

  • URL validation function created
  • Blocks localhost, private IPs, link-local addresses
  • Only allows HTTP/HTTPS protocols
  • Tests added for URL validation
  • Error handling for blocked URLs
  • Documentation updated

Critical Features

FEATURE-001: Add structured JSON logging

Title: Implement structured JSON logging for production observability

Labels: enhancement, observability, priority:critical

Description: Current logging outputs human-readable text which is difficult to parse in production log aggregation systems. Need structured JSON format.

Why: Production log aggregation systems (CloudWatch, Datadog, Splunk) require structured logs for:

  • Filtering and searching
  • Alerting on specific conditions
  • Analytics and dashboards

Affected Files:

  • src/utils/logger.ts
  • src/types/index.ts

Proposed Implementation:

interface StructuredLog {
    timestamp: string;
    level: LogLevel;
    message: string;
    context?: Record<string, unknown>;
    correlationId?: string;
    traceId?: string;
}

class StructuredLogger extends Logger {
    log(level: LogLevel, message: string, context?: Record<string, unknown>) {
        const entry: StructuredLog = {
            timestamp: new Date().toISOString(),
            level,
            message,
            context,
            correlationId: this.correlationId,
        };
        console.log(JSON.stringify(entry));
    }
}

Acceptance Criteria:

  • StructuredLogger class created
  • JSON output format implemented
  • Backward compatible with existing Logger
  • Configuration option to enable JSON mode
  • Tests added for structured logging
  • Documentation updated

FEATURE-004: Add Zod schema validation

Title: Replace manual validation with Zod schema validation

Labels: enhancement, validation, priority:critical

Description: Current manual validation is error-prone and lacks type safety. Zod provides runtime validation with TypeScript integration.

Why:

  • Type-safe validation
  • Better error messages
  • Reduced boilerplate
  • Maintained by community

Affected Files:

  • src/configuration/ConfigurationValidator.ts
  • worker/handlers/compile.ts
  • deno.json (add dependency)

Proposed Implementation:

import { z } from "https://deno.land/x/zod/mod.ts";

const SourceSchema = z.object({
    source: z.string().url(),
    name: z.string().optional(),
    type: z.enum(['adblock', 'hosts']).optional(),
});

const ConfigurationSchema = z.object({
    name: z.string().min(1),
    description: z.string().optional(),
    sources: z.array(SourceSchema).nonempty(),
    transformations: z.array(z.nativeEnum(TransformationType)).optional(),
    exclusions: z.array(z.string()).optional(),
    inclusions: z.array(z.string()).optional(),
});

Acceptance Criteria:

  • Zod dependency added to deno.json
  • ConfigurationSchema created
  • ConfigurationValidator refactored to use Zod
  • Request body schemas added to handlers
  • Error messages match or improve on current format
  • All tests passing
  • Documentation updated

FEATURE-006: Add centralized error reporting service

Title: Implement centralized error reporting for production monitoring

Labels: enhancement, observability, priority:critical

Description: Errors are currently only logged locally. Need centralized error reporting to tracking services like Sentry or Datadog.

Why:

  • Aggregate errors across all instances
  • Alert on error rate increases
  • Track error trends
  • Capture stack traces and context
  • Monitor production health

Affected Files:

  • Create src/utils/ErrorReporter.ts
  • Update all try/catch blocks

Proposed Implementation:

interface ErrorReporter {
    report(error: Error, context?: Record<string, unknown>): void;
}

class SentryErrorReporter implements ErrorReporter {
    constructor(private dsn: string) {}

    report(error: Error, context?: Record<string, unknown>): void {
        // Send to Sentry with context
    }
}

class ConsoleErrorReporter implements ErrorReporter {
    report(error: Error, context?: Record<string, unknown>): void {
        console.error(ErrorUtils.format(error), context);
    }
}

Acceptance Criteria:

  • ErrorReporter interface created
  • SentryErrorReporter implementation
  • ConsoleErrorReporter implementation
  • Integration points added to catch blocks
  • Configuration via environment variable
  • Tests added
  • Documentation updated

FEATURE-008: Implement circuit breaker pattern

Title: Add circuit breaker for unreliable source downloads

Labels: enhancement, resilience, priority:critical

Description: When filter list sources are consistently failing, we continue retrying them, wasting resources. Circuit breaker prevents cascading failures.

Why:

  • Prevent resource waste on failing sources
  • Fail fast for known-bad sources
  • Automatic recovery attempt after timeout
  • Improve overall system resilience

Affected Files:

  • Create src/utils/CircuitBreaker.ts
  • src/downloader/FilterDownloader.ts

Proposed Implementation:

class CircuitBreaker {
    private failureCount = 0;
    private state: "CLOSED" | "OPEN" | "HALF_OPEN" = "CLOSED";
    private lastFailureTime?: Date;

    constructor(
        private threshold: number = 5,
        private timeout: number = 60000,
    ) {}

    async execute<T>(fn: () => Promise<T>): Promise<T> {
        if (this.state === 'OPEN') {
            if (Date.now() - this.lastFailureTime!.getTime() > this.timeout) {
                this.state = 'HALF_OPEN';
            } else {
                throw new Error('Circuit breaker is OPEN');
            }
        }

        try {
            const result = await fn();
            this.onSuccess();
            return result;
        } catch (error) {
            this.onFailure();
            throw error;
        }
    }
}

Acceptance Criteria:

  • CircuitBreaker class created
  • States: CLOSED, OPEN, HALF_OPEN
  • Configurable failure threshold and timeout
  • Integration with FilterDownloader
  • Status monitoring endpoint
  • Tests added for all states
  • Documentation updated

FEATURE-009: Add OpenTelemetry integration

Title: Implement OpenTelemetry for distributed tracing

Labels: enhancement, observability, priority:critical

Description: Current tracing system is custom and not compatible with standard observability platforms. OpenTelemetry is industry standard.

Why:

  • Compatible with all major platforms (Datadog, Honeycomb, Jaeger)
  • Distributed tracing across services
  • Standard instrumentation
  • Rich ecosystem of integrations

Affected Files:

  • Create src/diagnostics/OpenTelemetryExporter.ts
  • src/compiler/SourceCompiler.ts
  • worker/worker.ts
  • deno.json (add dependency)

Proposed Implementation:

import { SpanStatusCode, trace } from "@opentelemetry/api";

const tracer = trace.getTracer('adblock-compiler', VERSION);

async function compileWithTracing(config: IConfiguration): Promise<string> {
    return tracer.startActiveSpan('compile', async (span) => {
        try {
            span.setAttribute('config.name', config.name);
            span.setAttribute('config.sources.count', config.sources.length);

            const result = await compile(config);

            span.setStatus({ code: SpanStatusCode.OK });
            return result;
        } catch (error) {
            span.recordException(error);
            span.setStatus({ code: SpanStatusCode.ERROR });
            throw error;
        } finally {
            span.end();
        }
    });
}

Acceptance Criteria:

  • OpenTelemetry dependencies added
  • Tracer configuration
  • Spans added to compilation operations
  • Integration with existing tracing context
  • Exporter configuration (OTLP, console)
  • Tests added
  • Documentation updated

Medium Priority Examples

FEATURE-002: Per-module log level configuration

Title: Add per-module log level configuration

Labels: enhancement, observability, priority:medium

Description: Currently log level is global. Need ability to set different log levels for different modules during debugging.

Example:

const logger = new Logger({
    defaultLevel: LogLevel.Info,
    moduleOverrides: {
        "compiler": LogLevel.Debug,
        "downloader": LogLevel.Trace,
    },
});

Acceptance Criteria:

  • LoggerConfig interface with moduleOverrides
  • Logger respects module-specific levels
  • Configuration via environment variables
  • Tests added
  • Documentation updated

BUG-004: Fix silent error swallowing in FilterService

Title: FilterService should not silently swallow download errors

Labels: bug, error-handling, priority:medium

Description: FilterService.downloadSource() catches errors and returns empty string, making it impossible for callers to know if download failed.

Location: src/services/FilterService.ts:44

Current Code:

try {
    const content = await this.downloader.download(source);
    return content;
} catch (error) {
    this.logger.error(`Failed to download source: ${source}`, error);
    return ''; // Silent failure
}

Proposed Solutions:

Option 1: Let error propagate

throw ErrorUtils.wrap(error, `Failed to download source: ${source}`);

Option 2: Return Result type

return { success: false, error: ErrorUtils.getMessage(error) };

Acceptance Criteria:

  • Choose and implement solution
  • Update callers to handle errors
  • Tests added for error cases
  • Documentation updated

Summary Statistics

Total Items: 22 (12 bugs + 10 features shown as examples)

By Priority:

  • Critical: 12 items
  • High: 7 items
  • Medium: 10 items
  • Low: 5 items

By Category:

  • Security: 5 items
  • Observability: 8 items
  • Validation: 4 items
  • Error Handling: 4 items
  • Testing: 3 items
  • Operations: 3 items

Estimated Effort: 8-12 weeks for all items


Creating Issues

To create issues from these templates:

  1. Copy the relevant template above
  2. Create new issue in GitHub
  3. Paste template content
  4. Add appropriate labels
  5. Assign to milestone if applicable
  6. Link related issues

Bulk Creation Script

For bulk issue creation, consider using GitHub CLI:

# Example for BUG-002
gh issue create \
  --title "Add request body size limits to prevent DoS attacks" \
  --body-file issue-templates/BUG-002.md \
  --label "bug,security,priority:critical"

See BUGS_AND_FEATURES.md for quick reference list and PRODUCTION_READINESS.md for detailed analysis.

CLAUDE.md - AI Assistant Guide

This document provides essential context for AI assistants working with the adblock-compiler codebase.

Project Overview

AdBlock Compiler is a Compiler-as-a-Service for adblock filter lists. It transforms, optimizes, and combines filter lists from multiple sources with real-time progress tracking.

  • Version: 0.7.12
  • Runtime: Deno 2.4+ (primary), Node.js compatible, Cloudflare Workers compatible
  • Language: TypeScript (strict mode, 100% type-safe)
  • License: GPL-3.0
  • JSR Package: @jk-com/adblock-compiler

Quick Commands

# Development
deno task dev              # Development with watch mode
deno task compile          # Run compiler CLI

# Testing
deno task test             # Run all tests
deno task test:watch       # Tests in watch mode
deno task test:coverage    # Generate coverage reports

# Code Quality
deno task lint             # Lint code
deno task fmt              # Format code
deno task fmt:check        # Check formatting
deno task check            # Type check

# Build & Deploy
deno task build            # Build standalone executable
deno task wrangler:dev     # Run wrangler dev server (port 8787)
deno task wrangler:deploy  # Deploy to Cloudflare Workers

# Benchmarks
deno task bench            # Run performance benchmarks

Project Structure

src/
├── cli/                   # CLI implementation (ArgumentParser, ConfigurationLoader)
├── compiler/              # Core compilation (FilterCompiler, SourceCompiler)
├── configuration/         # Config validation (pure TypeScript, no AJV)
├── transformations/       # 11 rule transformations (see below)
├── downloader/            # Content fetching & preprocessing
├── platform/              # Platform abstraction (Workers, Deno, Node.js)
├── storage/               # Caching & health monitoring
├── filters/               # Rule filtering utilities
├── utils/                 # Utilities (RuleUtils, Wildcard, TldUtils, etc.)
├── types/                 # TypeScript interfaces (IConfiguration, ISource)
├── index.ts               # Library exports
├── mod.ts                 # Deno module exports
└── cli.deno.ts            # Deno CLI entry point

worker/
├── worker.ts              # Cloudflare Worker (main API handler)
└── html.ts                # HTML templates

public/                    # Static web UI assets
examples/                  # Example filter list configurations
docs/                      # Additional documentation

Architecture Patterns

The codebase uses these key patterns:

  • Strategy Pattern: Transformations (SyncTransformation, AsyncTransformation)
  • Builder Pattern: TransformationPipeline construction
  • Factory Pattern: TransformationRegistry
  • Composite Pattern: CompositeFetcher for chaining fetchers
  • Adapter Pattern: Platform abstraction layer

Two Compiler Classes

  1. FilterCompiler (src/compiler/) - File system-based, for Deno/Node.js CLI
  2. WorkerCompiler (src/platform/) - Platform-agnostic, for Workers/browsers

Transformation System

11 available transformations applied in order:

  1. ConvertToAscii - Non-ASCII to Punycode
  2. RemoveComments - Remove ! and # comment lines
  3. Compress - Hosts to adblock syntax conversion
  4. RemoveModifiers - Strip unsupported modifiers
  5. Validate - Remove dangerous/incompatible rules
  6. ValidateAllowIp - Like Validate but keeps IPs
  7. Deduplicate - Remove duplicate rules
  8. InvertAllow - Convert blocks to allow rules
  9. RemoveEmptyLines - Remove blank lines
  10. TrimLines - Remove leading/trailing whitespace
  11. InsertFinalNewLine - Add final newline

All transformations extend SyncTransformation or AsyncTransformation base classes in src/transformations/base/.

Code Conventions

Naming

  • Classes: PascalCase (FilterCompiler, RemoveCommentsTransformation)
  • Functions/methods: camelCase (executeSync, validate)
  • Constants: UPPER_SNAKE_CASE (CACHE_TTL, RATE_LIMIT_MAX_REQUESTS)
  • Interfaces: I-prefixed (IConfiguration, ILogger, ISource)
  • Enums: PascalCase (TransformationType, SourceType)

File Organization

  • Each module in its own directory with index.ts exports
  • Tests co-located as *.test.ts next to source files
  • No deeply nested directory structures

TypeScript

  • Strict mode enabled (all strict options)
  • No implicit any
  • Explicit return types on public methods
  • Use interfaces over type aliases for object shapes

Error Handling

  • Custom error types for specific scenarios
  • Validation results over exceptions where possible
  • Retry logic with exponential backoff for network operations

Testing

Tests use Deno's native testing framework:

# Run all tests
deno test --allow-read --allow-write --allow-net --allow-env

# Run specific test file
deno test src/utils/RuleUtils.test.ts --allow-read

# Run with coverage
deno task test:coverage

Test file conventions:

  • Co-located with source: FileName.ts -> FileName.test.ts
  • Use Deno.test() with descriptive names
  • Mock external dependencies (network, file system)

Configuration Schema

interface IConfiguration {
    name: string; // Required
    description?: string;
    homepage?: string;
    license?: string;
    version?: string;
    sources: ISource[]; // Required, non-empty
    transformations?: TransformationType[];
    exclusions?: string[]; // Patterns to exclude
    inclusions?: string[]; // Patterns to include
}

interface ISource {
    source: string; // URL or file path
    name?: string;
    type?: 'adblock' | 'hosts';
    transformations?: TransformationType[];
    exclusions?: string[];
    inclusions?: string[];
}

Pattern types: plain string (contains), *.wildcard, /regex/

API Endpoints (Worker)

  • POST /compile - JSON compilation API
  • POST /compile/stream - Streaming with SSE
  • POST /compile/batch - Batch up to 10 lists
  • POST /compile/async - Queue-based async compilation
  • POST /compile/batch/async - Queue-based batch compilation
  • GET /metrics - Performance metrics
  • GET / - Interactive web UI

Key Files to Know

FilePurpose
src/compiler/FilterCompiler.tsMain compilation logic
src/platform/WorkerCompiler.tsPlatform-agnostic compiler
src/transformations/TransformationRegistry.tsTransformation management
src/configuration/ConfigurationValidator.tsConfig validation
src/downloader/FilterDownloader.tsContent fetching with retries
src/types/index.tsCore type definitions
worker/worker.tsCloudflare Worker API handler
deno.jsonDeno tasks and configuration
wrangler.tomlCloudflare Workers config

Platform Support

The codebase supports multiple runtimes through the platform abstraction layer:

  • Deno (primary) - Full file system access
  • Node.js - npm-compatible via package.json
  • Cloudflare Workers - No file system, HTTP-only
  • Web Workers - Browser background threads

Use FilterCompiler for CLI/server environments, WorkerCompiler for edge/browser.

Dependencies

Minimal external dependencies:

  • @luca/cases (JSR) - String case conversion
  • @std/* (Deno Standard Library) - Core utilities
  • tldts (npm) - TLD/domain parsing
  • wrangler (dev) - Cloudflare deployment

Common Tasks

Adding a New Transformation

  1. Create src/transformations/MyTransformation.ts
  2. Extend SyncTransformation or AsyncTransformation
  3. Implement execute(lines: string[]): string[]
  4. Register in TransformationRegistry.ts
  5. Add to TransformationType enum in src/types/index.ts
  6. Write co-located tests

Modifying the API

  1. Edit worker/worker.ts
  2. Update route handlers
  3. Test with deno task wrangler:dev
  4. Deploy with deno task wrangler:deploy

Adding CLI Options

  1. Add to ParsedArguments interface in src/cli/ArgumentParser.ts
  2. Update parseArgs() in src/cli/ArgumentParser.ts (add to boolean, string, or collect arrays)
  3. Add to ICliArgs interface in src/cli/CliApp.deno.ts
  4. Update parseArgs() in src/cli/CliApp.deno.ts
  5. Handle the new flag in buildTransformations(), createConfig(), readConfig(), or run() as appropriate
  6. Add the field to CliArgumentsOutput type and CliArgumentsSchema in src/configuration/schemas.ts
  7. Update showHelp() in both ArgumentParser.ts and CliApp.deno.ts
  8. Update docs/usage/CLI.md

CI/CD Pipeline

GitHub Actions workflow (.github/workflows/ci.yml):

  1. Test: Run all tests with coverage
  2. Type Check: Full TypeScript validation
  3. Security: Trivy vulnerability scanning
  4. JSR Publish: Auto-publish on master push
  5. Worker Deploy: Deploy to Cloudflare Workers
  6. Pages Deploy: Deploy static assets

Environment Variables

See .env.example for available options:

  • PORT - Server port (default: 8787)
  • DENO_DIR - Deno cache directory
  • Cloudflare bindings configured in wrangler.toml

Version Management

This document describes how version strings are managed across the adblock-compiler project to ensure consistency and prevent version drift.

Single Source of Truth

src/version.ts is the canonical source for the package version.

export const VERSION = '0.12.0';

Version Synchronization

All version strings flow from src/version.ts:

1. Package Metadata

src/version.ts is the only writable version file. All other files are synced from it automatically by the scripts/sync-version.ts script:

# After editing src/version.ts, propagate to all other files:
deno task version:sync

The following files are read-only (do not edit their version strings directly):

  • deno.json - Synced by version:sync (required for JSR publishing)
  • package.json - Synced by version:sync (required for npm compatibility)
  • package-lock.json - not modified by version:sync; it is updated automatically by npm when npm install is run after package.json has been synced
  • wrangler.toml - Synced by version:sync (COMPILER_VERSION env var)

2. Worker Code (Automatic)

Worker code imports and uses VERSION as a fallback:

  • worker/worker.ts - Imports VERSION, uses env.COMPILER_VERSION || VERSION
  • worker/router.ts - Imports VERSION, uses env.COMPILER_VERSION || VERSION
  • worker/websocket.ts - Imports VERSION, uses env.COMPILER_VERSION || VERSION

This ensures that even if COMPILER_VERSION is not set in the environment, the worker will use the correct version from src/version.ts.

3. Web UI (Dynamic Loading)

HTML files load version dynamically from the API at runtime:

  • public/index.html - Calls /api/version endpoint via loadVersion()
  • public/compiler.html - Calls /api/version and /api endpoints via fetchCompilerVersion()

Fallback HTML values are provided for offline/error scenarios but are always overridden by the API response.

4. Tests

Test files import VERSION for consistency:

  • worker/queue.integration.test.ts - Uses VERSION + '-test'

Version Update Process

The project uses automatic version bumping based on Conventional Commits:

  • Automatic: Version is bumped automatically when you merge PRs with proper commit messages
  • No manual editing: Version files are updated automatically
  • Changelog generation: CHANGELOG.md is updated automatically
  • Release creation: GitHub releases are created automatically

See AUTO_VERSION_BUMP.md for complete details.

Quick Guide:

# Minor bump (new feature)
git commit -m "feat: add new transformation"

# Patch bump (bug fix)
git commit -m "fix: resolve parsing error"

# Major bump (breaking change)
git commit -m "feat!: change API interface"

Manual (Fallback)

If you need to manually bump the version:

  1. ✅ Update src/version.ts - Change the VERSION constant (only writable source)
  2. ✅ Run deno task version:sync - Propagates to deno.json, package.json, wrangler.toml, and HTML fallback spans in public/index.html and public/compiler.html
  3. ✅ Update CHANGELOG.md - Document the changes
  4. ✅ Commit with message: chore: bump version to X.Y.Z [skip ci]

Or use the GitHub Actions workflow: Actions → Version Bump → Run workflow

Architecture Benefits

Before (Version Drift Problem)

  • Multiple hardcoded version strings scattered across the codebase
  • Easy to forget updating some locations
  • Version drift between components (e.g., 0.11.3, 0.11.4, 0.11.5, 0.12.0 all present)

After (Single Source of Truth)

  • One canonical writable source: src/version.ts
  • All other version files (deno.json, package.json, wrangler.toml) are read-only – synced via deno task version:sync
  • Worker imports and uses it automatically
  • Web UI loads it dynamically from API
  • CI/CD version-bump workflow updates only src/version.ts then runs the sync script

Version Flow Diagram

src/version.ts (VERSION = '0.12.0')
    ↓
    ├─→ worker/worker.ts (import VERSION)
    │   └─→ API endpoints (/api, /api/version)
    │       └─→ public/index.html (loadVersion())
    │       └─→ public/compiler.html (fetchCompilerVersion())
    │
    ├─→ worker/router.ts (import VERSION)
    ├─→ worker/websocket.ts (import VERSION)
    └─→ worker/queue.integration.test.ts (import VERSION)

Implementation Details

Worker Fallback Pattern

All worker files use this pattern:

import { VERSION } from '../src/version.ts';

// Later in code:
version: env.COMPILER_VERSION || VERSION;

This ensures:

  1. Production uses COMPILER_VERSION from wrangler.toml
  2. Local dev/tests use VERSION from src/version.ts if env var missing
  3. No "unknown" versions

Dynamic Loading in HTML

Both HTML files fetch version at page load:

async function loadVersion() {
    const response = await fetch('/api/version');
    const result = await response.json();
    const version = result.data?.version || result.version;
    document.getElementById('version').textContent = version;
}

This ensures:

  1. Version always matches deployed worker
  2. No manual HTML updates needed
  3. Fallback version only shown on API failure

Troubleshooting

Version shows as "unknown"

  • Check that COMPILER_VERSION is set in wrangler.toml
  • Verify worker files import VERSION from src/version.ts
  • Ensure fallback pattern env.COMPILER_VERSION || VERSION is used

Version shows old value in UI

  • Check browser cache - hard refresh (Ctrl+F5)
  • Verify API endpoint /api/version returns correct version
  • Check that JavaScript loadVersion() function is being called

Versions out of sync

  • Check src/version.ts is the intended version
  • Run deno task version:sync to propagate to all other files
  • Use grep to find any remaining hardcoded version strings:
    grep -r "0\.11\." --include="*.ts" --include="*.html" --include="*.toml"
    
  • src/version.ts - Primary version definition
  • deno.json - Package version
  • package.json - Package version
  • wrangler.toml - Worker environment variable
  • public/index.html - HTML fallback version span (auto-synced by version:sync)
  • public/compiler.html - HTML fallback version spans (auto-synced by version:sync)
  • CHANGELOG.md - Version history
  • .github/copilot-instructions.md - Contains version sync instructions for AI assistance

Release Notes

Release notes, changelogs, and announcements for Adblock Compiler versions.

Contents

  • CHANGELOG - Full version history and release notes

Version 0.8.0 Release Summary

🎉 Major Release - Admin Dashboard & Enhanced User Experience

This release represents a significant milestone in making Adblock Compiler a professional, user-friendly platform that showcases the power and versatility of the compiler-as-a-service model.


🌟 Highlights

Admin Dashboard - Your Command Center

The new admin dashboard (/) is now the landing page that provides:

  • 📊 Real-time Metrics - Live monitoring of requests, queue depth, cache performance, and response times
  • 🎯 Smart Navigation - Quick access to all tools (Compiler, Tests, E2E, WebSocket Demo, API Docs)
  • 📈 Queue Visualization - Beautiful Chart.js graphs showing queue depth over time
  • 🔔 Async Notifications - Browser notifications when compilation jobs complete
  • 🧪 Interactive API Tester - Test API endpoints directly from the dashboard
  • ⚡ Quick Actions - One-click access to metrics, stats, and documentation

Key Features

1. Real-time Monitoring

The dashboard displays four critical metrics that auto-refresh every 30 seconds:

┌─────────────────┬─────────────────┬─────────────────┬─────────────────┐
│ Total Requests  │  Queue Depth    │ Cache Hit Rate  │ Avg Response    │
│     1,234       │       5         │     87%         │     245ms       │
└─────────────────┴─────────────────┴─────────────────┴─────────────────┘

2. Notification System

Browser/OS Notifications:

  • Get notified when async compilation jobs complete
  • Works across browser tabs and even when minimized
  • Persistent tracking via LocalStorage

In-Page Toasts:

  • Success (Green) - Job completed
  • Error (Red) - Job failed
  • Warning (Yellow) - Important updates
  • Info (Blue) - General notifications

Smart Features:

  • Debounced localStorage updates for performance
  • Automatic cleanup of old jobs (1 hour retention)
  • Stops polling when no jobs are tracked (saves resources)

3. Interactive API Tester

Test API endpoints without leaving the dashboard:

  • GET /api - API information
  • GET /metrics - Performance metrics
  • GET /queue/stats - Queue statistics
  • POST /compile - Compile filter lists

Features:

  • Pre-configured example requests
  • JSON syntax validation
  • Response display with status codes
  • Success/error notifications
  • Reset functionality

4. Educational Content

The dashboard teaches users about the platform:

WebSocket vs SSE vs Queue:

POST /compile         → Simple JSON response
POST /compile/stream  → SSE progress updates
GET /ws/compile       → WebSocket bidirectional
POST /compile/async   → Queue for background

When to Use WebSocket:

  • Full-duplex communication needed
  • Lower latency is critical
  • Send data both ways (client ↔ server)
  • Interactive applications requiring instant feedback

📂 Project Organization

Root Directory Cleanup

Before:

.
├── CODE_REVIEW.old.md         ❌ Removed (outdated)
├── REVIEW_SUMMARY.md          ❌ Removed (outdated)
├── coverage.lcov              ❌ Removed (build artifact)
├── postman-collection.json    ❌ Moved to docs/tools/
├── postman-environment.json   ❌ Moved to docs/tools/
├── prisma.config.ts           ❌ Moved to prisma/
└── ... (other files)

After:

.
├── CHANGELOG.md              ✅ Updated for v0.8.0
├── README.md                 ✅ Enhanced with v0.8.0 features
├── deno.json                 ✅ Version 0.8.0
├── package.json              ✅ Version 0.8.0
├── docs/
│   ├── ADMIN_DASHBOARD.md    ✅ New comprehensive guide
│   ├── tools/
│   │   ├── postman-collection.json
│   │   └── postman-environment.json
│   └── ... (other docs)
├── prisma/
│   └── prisma.config.ts      ✅ Moved from root
├── public/
│   ├── index.html            ✅ New admin dashboard
│   ├── compiler.html         ✅ Renamed from index.html
│   ├── test.html
│   ├── e2e-tests.html
│   └── websocket-test.html
└── src/
    └── version.ts            ✅ Version 0.8.0

🎨 User Experience Enhancements

Professional Design

  • Modern gradient backgrounds
  • Card-based navigation with hover effects
  • Responsive design (mobile-friendly)
  • High-contrast colors for accessibility
  • Smooth animations and transitions

Intuitive Navigation

Dashboard (/)
├── 🔧 Compiler UI (/compiler.html)
├── 🧪 API Test Suite (/test.html)
├── 🔬 E2E Tests (/e2e-tests.html)
├── 🔌 WebSocket Demo (/websocket-test.html)
├── 📖 API Documentation (/docs/api/index.html)
└── 📊 Metrics & Stats

Smart Features

  1. Auto-refresh - Metrics update every 30 seconds
  2. Job monitoring - Polls every 10 seconds when tracking jobs
  3. Efficient polling - Stops when no jobs to track
  4. Debounced saves - Reduces localStorage writes
  5. Error recovery - Graceful degradation on failures

📚 Documentation

New Documentation

  • docs/ADMIN_DASHBOARD.md - Complete dashboard guide
    • Overview of all features
    • Notification system documentation
    • API tester usage
    • Customization options
    • Browser compatibility
    • Performance considerations

Updated Documentation

  • README.md - Highlights v0.8.0 features prominently
  • CHANGELOG.md - Comprehensive release notes
  • docs/POSTMAN_TESTING.md - Updated file paths
  • docs/api/QUICK_REFERENCE.md - Updated file paths
  • docs/OPENAPI_TOOLING.md - Updated file paths

🔧 Technical Improvements

Code Quality

State Management:

// Before: Global variables
let queueChart = null;
let notificationsEnabled = false;
let trackedJobs = new Map();

// After: Encapsulated state
const DashboardState = {
    queueChart: null,
    notificationsEnabled: false,
    trackedJobs: new Map(),
    jobMonitorInterval: null,
    saveTrackedJobs: /* debounced function */
};

Performance Optimizations:

  • Debounced localStorage updates (1 second)
  • Smart interval management (stops when idle)
  • Efficient Map serialization
  • Lazy chart initialization

Security:

  • No use of eval() or Function constructor
  • Input validation for JSON
  • CORS properly configured
  • No sensitive data exposed

🚀 Deployment

The admin dashboard is production-ready and deployed to:

Live URL: https://adblock-compiler.jayson-knight.workers.dev/

Features:

  • Cloudflare Workers edge deployment
  • Global CDN distribution
  • KV storage for caching
  • Rate limiting (10 req/min)
  • Optional Turnstile bot protection

📊 Metrics

File Changes

Files Changed:    20
Insertions:     +3,200 lines
Deletions:      -1,100 lines
Net Change:     +2,100 lines

New Features

  • ✅ Admin Dashboard
  • ✅ Notification System
  • ✅ Interactive API Tester
  • ✅ Queue Visualization
  • ✅ Educational Content
  • ✅ Documentation Hub

🎯 User Benefits

Before v0.8.0

Users had to:

  • Navigate directly to compiler UI
  • Manually check queue stats
  • Use external tools to test API
  • Switch between multiple pages for docs

After v0.8.0

Users can:

  • ✅ See everything at a glance from dashboard
  • ✅ Monitor metrics in real-time
  • ✅ Get notified when jobs complete
  • ✅ Test API directly from browser
  • ✅ Learn about features through UI
  • ✅ Navigate quickly between tools

🏆 Achievement Unlocked

This release demonstrates:

  • Professional Quality - Production-ready UI/UX
  • User-Centric Design - Intuitive and helpful
  • Performance - Efficient resource usage
  • Documentation - Comprehensive guides
  • Accessibility - Responsive and inclusive
  • Innovation - Novel notification system

🔮 Future Enhancements

Potential additions in future releases:

  • Dark mode toggle
  • Customizable refresh intervals
  • Historical metrics graphs (week/month view)
  • Job scheduling interface
  • Filter list library management
  • User authentication for admin features
  • Export metrics to CSV/JSON
  • Advanced queue analytics

🙏 Credits

Developed by: Jayson Knight
Package: @jk-com/adblock-compiler
Repository: https://github.com/jaypatrick/adblock-compiler
License: GPL-3.0

Based on: @adguard/hostlist-compiler


📝 Summary

Version 0.8.0 transforms Adblock Compiler from a simple compilation tool into a comprehensive, professional platform. The new admin dashboard showcases the power of the software while making it incredibly easy to use. With real-time monitoring, async notifications, and an interactive API tester, users can manage their filter list compilations with confidence and ease.

This release shows users just how cool this software really is! 🎉

Introducing Adblock Compiler: A Compiler-as-a-Service for Filter Lists

Published: 2026

Combining filter lists from multiple sources shouldn't be complex. Whether you're managing a DNS blocker, ad blocker, or content filtering system, the ability to merge, validate, and optimize rules is essential. Today, we're excited to introduce Adblock Compiler—a modern, production-ready solution for transforming and compiling filter lists at scale.

What is Adblock Compiler?

Adblock Compiler is a powerful Compiler-as-a-Service package (v0.11.4) that simplifies the creation and management of filter lists. It's a Deno-native rewrite of the original @adguard/hostlist-compiler, offering improved performance, no Node.js dependencies, and support for modern edge platforms.

At its core, Adblock Compiler does one thing exceptionally well: it transforms, optimizes, and combines adblock filter lists from multiple sources into production-ready blocklists.

flowchart TD
    SRC["Multiple Filter Sources<br/>(URLs, files, inline rules - multiple formats supported)"]
    subgraph PIPE["Adblock Compiler Pipeline"]
        direction TB
        P1["1. Parse and normalize rules"]
        P2["2. Apply transformations (11 different types)"]
        P3["3. Remove duplicates and invalid rules"]
        P4["4. Validate for compatibility"]
        P5["5. Compress and optimize"]
        P1 --> P2 --> P3 --> P4 --> P5
    end
    SRC --> P1
    P5 --> OUT["Output in Multiple Formats<br/>(Adblock, Hosts, Dnsmasq, Pi-hole, Unbound, DoH, JSON)"]

Why Adblock Compiler?

Managing filter lists manually is tedious and error-prone. You need to:

  • Combine lists from multiple sources and maintainers
  • Handle different formats (adblock syntax, /etc/hosts, etc.)
  • Remove duplicates while maintaining performance
  • Validate rules for your specific platform
  • Optimize for cache and memory
  • Automate updates and deployments

Adblock Compiler handles all of this automatically.

Key Features

1. 🎯 Multi-Source Compilation

Merge filter lists from any combination of sources:

{
  "name": "My Custom Blocklist",
  "sources": [
    {
      "source": "https://adguardteam.github.io/AdGuardSDNSFilter/Filters/filter.txt",
      "type": "adblock",
      "transformations": ["RemoveComments", "Validate"]
    },
    {
      "source": "/etc/hosts.local",
      "type": "hosts",
      "transformations": ["Compress"]
    },
    {
      "source": "https://example.com/custom-rules.txt",
      "exclusions": ["whitelist.example.com"]
    }
  ],
  "transformations": ["Deduplicate", "RemoveEmptyLines"]
}

2. ⚡ Performance & Optimization

Adblock Compiler delivers impressive performance metrics:

  • Gzip compression: 70-80% cache size reduction
  • Smart deduplication: Removes redundant rules while preserving order
  • Request deduplication: Avoids fetching the same source twice
  • Intelligent caching: Detects changes and rebuilds only when needed
  • Batch processing: Compile up to 10 lists in parallel

3. 🔄 11 Built-in Transformations

Transform and clean your filter lists with a comprehensive suite:

  1. ConvertToAscii - Convert internationalized domains (IDN) to ASCII
  2. RemoveComments - Strip comment lines (! and # prefixes)
  3. Compress - Convert hosts→adblock syntax, remove redundancies
  4. RemoveModifiers - Remove unsupported rule modifiers for DNS blockers
  5. Validate - Remove invalid/incompatible rules for DNS blockers
  6. ValidateAllowIp - Like Validate, but preserves IP addresses
  7. Deduplicate - Remove duplicates while preserving order
  8. InvertAllow - Convert blocking rules to whitelist rules
  9. RemoveEmptyLines - Clean up empty lines
  10. TrimLines - Remove leading/trailing whitespace
  11. InsertFinalNewLine - Ensure proper file termination

Important: Transformations always execute in this specific order, ensuring predictable results.

4. 🌐 Platform Support

Adblock Compiler runs everywhere:

flowchart TD
    PAL["Platform Abstraction Layer"]
    PAL --> D["✓ Deno (native)"]
    PAL --> N["✓ Node.js (npm compatibility)"]
    PAL --> CF["✓ Cloudflare Workers"]
    PAL --> DD["✓ Deno Deploy"]
    PAL --> VE["✓ Vercel Edge Functions"]
    PAL --> AL["✓ AWS Lambda@Edge"]
    PAL --> WW["✓ Web Workers (browser background tasks)"]
    PAL --> BR["✓ Browsers (with server-side proxy for CORS)"]

The platform abstraction layer means you write code once and deploy anywhere. A production-ready Cloudflare Worker implementation is included in the repository.

5. 📡 Real-time Progress & Async Processing

Three ways to compile filter lists:

Synchronous:

# Simple command-line compilation
adblock-compiler -c config.json -o output.txt

Streaming:

// Real-time progress with Server-Sent Events
POST /compile/stream
Response: event stream with progress updates

Asynchronous:

// Background queue-based compilation
POST /compile/async
Response: { jobId: "uuid", queuePosition: 2 }

6. 🎨 Modern Web Interface

The included web UI provides:

  • Dashboard - Real-time metrics and queue monitoring
  • Compiler Interface - Visual filter list configuration
  • Admin Panel - Storage and configuration management
  • API Testing - Direct endpoint testing interface
  • Validation UI - Rule validation and AST visualization
┌────────────────────────────────────────────────────┐
│  Adblock Compiler - Interactive Web Dashboard      │
├────────────────────────────────────────────────────┤
│                                                    │
│  Compilation Queue: [████████░░] 8 pending       │
│  Average Time: 2.3s                              │
│                                                    │
│  ┌─────────────────────────────────────────────┐ │
│  │ Configuration                               │ │
│  ├─────────────────────────────────────────────┤ │
│  │ Name:        My Blocklist                  │ │
│  │ Sources:     3 configured                  │ │
│  │ Rules (in):  500,000                       │ │
│  │ Rules (out): 125,000 (after optimization)  │ │
│  │ Size (raw):  12.5 MB                       │ │
│  │ Size (gz):   1.8 MB (85% reduction)        │ │
│  │                                             │ │
│  │ [Compile] [Download] [Share]               │ │
│  └─────────────────────────────────────────────┘ │
│                                                    │
└────────────────────────────────────────────────────┘

7. 📚 Full OpenAPI 3.0.3 Documentation

Complete REST API with:

  • Interactive HTML documentation (Redoc)
  • Postman collections for testing
  • Contract testing for CI/CD
  • Client SDK code generation support
  • Full request/response examples

8. 🎪 Batch Processing

Compile multiple lists simultaneously:

POST /compile/batch
{
  "configurations": [
    { "name": "List 1", ... },
    { "name": "List 2", ... },
    { "name": "List 3", ... }
  ]
}

Process up to 10 lists in parallel with automatic queuing and deduplication.

Getting Started

Installation

Using Deno (recommended):

deno run --allow-read --allow-write --allow-net jsr:@jk-com/adblock-compiler \
  -c config.json -o output.txt

Using Docker:

git clone https://github.com/jaypatrick/adblock-compiler.git
cd adblock-compiler
docker compose up -d
# Access at http://localhost:8787

Build from source:

deno task build
# Creates standalone `adblock-compiler` executable

Quick Example

Convert and compress a blocklist:

adblock-compiler \
  -i hosts.txt \
  -i adblock.txt \
  -o compiled-blocklist.txt

Or use a configuration file for complex scenarios:

adblock-compiler -c config.json -o output.txt

TypeScript API

import { compile } from 'jsr:@jk-com/adblock-compiler';
import type { IConfiguration } from 'jsr:@jk-com/adblock-compiler';

const config: IConfiguration = {
  name: 'Custom Blocklist',
  sources: [
    {
      source: 'https://adguardteam.github.io/AdGuardSDNSFilter/Filters/filter.txt',
      transformations: ['RemoveComments', 'Validate'],
    },
  ],
  transformations: ['Deduplicate'],
};

const result = await compile(config);
await Deno.writeTextFile('blocklist.txt', result.join('\n'));

Architecture & Extensibility

Core Components

FilterCompiler - The main orchestrator that validates configuration, compiles sources, and applies transformations.

WorkerCompiler - A platform-agnostic compiler that works in edge runtimes (Cloudflare Workers, Lambda@Edge, etc.) without file system access.

TransformationRegistry - A plugin system for rule transformations. Extensible and composable.

PlatformDownloader - Handles network requests with retry logic, cycle detection for includes, and preprocessor directives.

Extensibility

Create custom transformations:

import { SyncTransformation, TransformationType } from '@jk-com/adblock-compiler';

class RemoveSocialMediaTransformation extends SyncTransformation {
  public readonly type = 'RemoveSocialMedia' as TransformationType;
  public readonly name = 'Remove Social Media';

  private socialDomains = ['facebook.com', 'twitter.com', 'instagram.com'];

  public executeSync(rules: string[]): string[] {
    return rules.filter((rule) => {
      return !this.socialDomains.some((domain) => rule.includes(domain));
    });
  }
}

// Register and use
const registry = new TransformationRegistry();
registry.register('RemoveSocialMedia' as any, new RemoveSocialMediaTransformation());

Implement custom content fetchers:

class RedisBackedFetcher implements IContentFetcher {
  async canHandle(source: string): Promise<boolean> {
    return source.startsWith('redis://');
  }

  async fetch(source: string): Promise<string> {
    const key = source.replace('redis://', '');
    return await redis.get(key);
  }
}

Use Cases

1. DNS Blockers (AdGuard Home, Pi-hole)

Compile DNS-compatible filter lists from multiple sources, validate rules, and automatically deploy updates.

2. Ad Blockers

Merge multiple ad-blocking lists, convert between formats, and optimize for performance.

3. Content Filtering

Combine content filters from different maintainers with custom exclusions and inclusions.

4. List Maintenance

Automate filter list generation, updates, and quality assurance in CI/CD pipelines.

5. Multi-Source Compilation

Create master lists that aggregate specialized blocklists (malware, tracking, spam, etc.).

6. Format Conversion

Convert between /etc/hosts, adblock, Dnsmasq, Pi-hole, and other formats.

Deployment Options

Local CLI

adblock-compiler -c config.json -o output.txt

Cloudflare Workers

Production-ready worker with web UI, REST API, WebSocket support, and queue integration:

npm install
deno task wrangler:dev   # Local development
deno task wrangler:deploy  # Deploy to Cloudflare

Access at your Cloudflare Workers URL with:

  • Web UI at /
  • API at POST /compile
  • Streaming at POST /compile/stream
  • Async Queue at POST /compile/async

Docker

Complete containerized deployment with:

docker compose up -d
# Access at http://localhost:8787

Includes multi-stage build, health checks, and production-ready configuration.

Edge Functions (Vercel, AWS Lambda@Edge, etc.)

Deploy anywhere with standard Fetch API support:

export default async function handler(request: Request) {
  const compiler = new WorkerCompiler({
    preFetchedContent: { /* sources */ },
  });
  const result = await compiler.compile(config);
  return new Response(result.join('\n'));
}

Advanced Features

Circuit Breaker with Exponential Backoff

Automatic retry logic for unreliable sources:

Request fails
      ↓
Retry after 1s (2^0)
      ↓
Retry after 2s (2^1)
      ↓
Retry after 4s (2^2)
      ↓
Retry after 8s (2^3)
      ↓
Max retries exceeded → Fallback or error

Preprocessor Directives

Advanced compilation with conditional includes:

!#if (os == "windows")
! Windows-specific rules
||example.com^$os=windows
!#endif

!#include https://example.com/rules.txt

Visual Diff Reporting

Track what changed between compilations:

Rules added:     2,341 (+12%)
Rules removed:   1,203 (-6%)
Rules modified:  523
Size change:     +2.1 MB (→ 12.5 MB)
Compression:     85% → 87%

Incremental Compilation

Cache source content and detect changes:

  • Skip recompilation if sources haven't changed
  • Automatic cache invalidation with checksums
  • Configurable storage backends

Conflict Detection

Identify and report conflicting rules:

  • Rules that contradict each other
  • Incompatible modifiers
  • Optimization suggestions

Performance Metrics

The package includes built-in benchmarking and diagnostics:

// Compile with metrics
const result = await compiler.compileWithMetrics(config, true);

// Output includes:
// - Parse time
// - Transformation times (per transformation)
// - Compilation time (total)
// - Output size (raw and compressed)
// - Cache hit rate
// - Memory usage

Integration with Cloudflare Tail Workers for real-time monitoring and error tracking.

Real-World Example

Here's a complete example: creating a master blocklist from multiple sources:

{
  "name": "Master Security Blocklist",
  "description": "Comprehensive blocklist combining security, privacy, and tracking filters",
  "homepage": "https://example.com",
  "license": "GPL-3.0",
  "version": "1.0.0",
  "sources": [
    {
      "name": "AdGuard DNS Filter",
      "source": "https://adguardteam.github.io/AdGuardSDNSFilter/Filters/filter.txt",
      "type": "adblock",
      "transformations": ["RemoveComments", "Validate"]
    },
    {
      "name": "Steven Black's Hosts",
      "source": "https://raw.githubusercontent.com/StevenBlack/hosts/master/hosts",
      "type": "hosts",
      "transformations": ["Compress"],
      "exclusions": ["whitelist.txt"]
    },
    {
      "name": "Local Rules",
      "source": "local-rules.txt",
      "type": "adblock",
      "transformations": ["RemoveComments"]
    }
  ],
  "transformations": ["Deduplicate", "RemoveEmptyLines", "InsertFinalNewLine"],
  "exclusions": ["trusted-domains.txt"]
}

Compile and deploy:

adblock-compiler -c blocklist-config.json -o blocklist.txt

# Or use CI/CD automation
deno run --allow-read --allow-write --allow-net --allow-env \
  jsr:@jk-com/adblock-compiler/cli -c config.json -o output.txt

Community & Feedback

Adblock Compiler is open-source and actively maintained:

  • Repository: https://github.com/jaypatrick/adblock-compiler
  • JSR Package: https://jsr.io/@jk-com/adblock-compiler
  • Issues & Discussions: https://github.com/jaypatrick/adblock-compiler/issues
  • Live Demo: https://adblock-compiler.jayson-knight.workers.dev/

Summary

Adblock Compiler brings modern development practices to filter list management. Whether you're:

  • Managing a single blocklist - Use the CLI for quick compilation
  • Running a production service - Deploy to Cloudflare Workers or Docker
  • Building an application - Import the library and use the TypeScript API
  • Automating updates - Integrate into CI/CD pipelines

Adblock Compiler provides the tools, performance, and flexibility you need.

Key takeaways:

Multi-source - Combine lists from any source ✅ Universal - Run anywhere (Deno, Node, Workers, browsers) ✅ Optimized - 11 transformations for maximum performance ✅ Extensible - Plugin system for custom transformations and fetchers ✅ Production-ready - Used in real-world deployments ✅ Developer-friendly - Full TypeScript support, OpenAPI docs, web UI

Get started today:

# Try it immediately
deno run --allow-read --allow-write --allow-net jsr:@jk-com/adblock-compiler \
  -i https://adguardteam.github.io/AdGuardSDNSFilter/Filters/filter.txt \
  -o my-blocklist.txt

# Or explore the interactive web UI
docker compose up -d

Resources


Ready to simplify your filter list management? Get started with Adblock Compiler today.

Testing Documentation

Guides for testing the Adblock Compiler at various levels.

Contents

Testing Documentation

Overview

This project has comprehensive unit test coverage using Deno's native testing framework. All tests are co-located with source files in the src/ directory.

Test Structure

Tests follow the pattern: *.test.ts files are placed next to their corresponding source files.

Example:

src/cli/
├── ArgumentParser.ts
├── ArgumentParser.test.ts  ← Test file
├── ConfigurationLoader.ts
└── ConfigurationLoader.test.ts  ← Test file

Running Tests

# Run all tests
deno task test

# Run tests with coverage
deno task test:coverage

# Run tests in watch mode
deno task test:watch

# Run specific test file
deno test src/cli/ArgumentParser.test.ts

# Run tests for a specific module
deno test src/transformations/

# Run tests with permissions
deno test --allow-read --allow-write --allow-net --allow-env

Test Coverage

Modules with Complete Coverage

CLI Module

  • ArgumentParser.ts - Argument parsing and validation (22 tests)
  • ConfigurationLoader.ts - JSON loading and validation (16 tests)
  • OutputWriter.ts - File writing (8 tests)

Compiler Module

  • FilterCompiler.ts - Main compilation logic (existing tests)
  • HeaderGenerator.ts - Header generation (16 tests)

Downloader Module

  • ConditionalEvaluator.ts - Boolean expression evaluation (25 tests)
  • ContentFetcher.ts - HTTP/file fetching (18 tests)
  • FilterDownloader.ts - Filter list downloading (existing tests)
  • PreprocessorEvaluator.ts - Directive processing (23 tests)

Transformations Module (11 transformations)

  • CompressTransformation.ts - Hosts to adblock conversion
  • ConvertToAsciiTransformation.ts - Unicode to ASCII conversion
  • DeduplicateTransformation.ts - Remove duplicate rules
  • ExcludeTransformation.ts - Pattern-based exclusion (10 tests)
  • IncludeTransformation.ts - Pattern-based inclusion (11 tests)
  • InsertFinalNewLineTransformation.ts - Final newline insertion
  • InvertAllowTransformation.ts - Allow rule inversion
  • RemoveCommentsTransformation.ts - Comment removal
  • RemoveEmptyLinesTransformation.ts - Empty line removal
  • RemoveModifiersTransformation.ts - Modifier removal
  • TrimLinesTransformation.ts - Whitespace trimming
  • ValidateTransformation.ts - Rule validation
  • TransformationRegistry.ts - Transformation management (13 tests)

Utils Module

  • Benchmark.ts - Performance benchmarking (existing tests)
  • EventEmitter.ts - Event emission (existing tests)
  • logger.ts - Logging functionality (17 tests)
  • RuleUtils.ts - Rule parsing utilities (existing tests)
  • StringUtils.ts - String utilities (existing tests)
  • TldUtils.ts - Domain/TLD parsing (36 tests)
  • Wildcard.ts - Wildcard pattern matching (existing tests)

Configuration Module

  • ConfigurationValidator.ts - Configuration validation (existing tests)

Platform Module

  • platform.test.ts - Platform abstractions (existing tests)

Storage Module

  • PrismaStorageAdapter.test.ts - Storage operations (existing tests)

Test Statistics

  • Total Test Files: 32
  • Total Modules Tested: 40+
  • Test Cases: 500+
  • Coverage: High coverage on all core functionality

Writing New Tests

Test File Template

import { assertEquals, assertExists, assertRejects } from '@std/assert';
import { MyClass } from './MyClass.ts';

Deno.test('MyClass - should do something', () => {
    const instance = new MyClass();
    const result = instance.doSomething();
    assertEquals(result, expectedValue);
});

Deno.test('MyClass - should handle errors', async () => {
    const instance = new MyClass();
    await assertRejects(
        async () => await instance.failingMethod(),
        Error,
        'Expected error message',
    );
});

Best Practices

  1. Co-locate tests - Place test files next to source files
  2. Use descriptive names - MyClass - should do something specific
  3. Test edge cases - Empty inputs, null values, boundary conditions
  4. Use mocks - Mock external dependencies (file system, HTTP)
  5. Keep tests isolated - Each test should be independent
  6. Use async/await - For asynchronous operations
  7. Clean up - Remove temporary files/state after tests

Mock Examples

Mock File System

class MockFileSystem implements IFileSystem {
    private files: Map<string, string> = new Map();

    setFile(path: string, content: string) {
        this.files.set(path, content);
    }

    async readTextFile(path: string): Promise<string> {
        return this.files.get(path) ?? '';
    }

    async writeTextFile(path: string, content: string): Promise<void> {
        this.files.set(path, content);
    }

    async exists(path: string): Promise<boolean> {
        return this.files.has(path);
    }
}

Mock HTTP Client

class MockHttpClient implements IHttpClient {
    private responses: Map<string, Response> = new Map();

    setResponse(url: string, response: Response) {
        this.responses.set(url, response);
    }

    async fetch(url: string): Promise<Response> {
        return this.responses.get(url) ?? new Response('', { status: 404 });
    }
}

Mock Logger

const mockLogger = {
    debug: () => {},
    info: () => {},
    warn: () => {},
    error: () => {},
};

Continuous Integration

Tests are automatically run on:

  • Push to main branch
  • Pull requests
  • Pre-deployment

Coverage Reports

Generate coverage reports:

# Generate coverage
deno task test:coverage

# View coverage report (HTML)
deno coverage coverage --html --include="^file:"

# Generate lcov report for CI
deno coverage coverage --lcov --output=coverage.lcov --include="^file:"

Troubleshooting

Tests fail with permission errors

Make sure to run with required permissions:

deno test --allow-read --allow-write --allow-net --allow-env

Tests timeout

Increase timeout for slow operations:

Deno.test({
    name: 'slow operation',
    fn: async () => {
        // test code
    },
    sanitizeOps: false,
    sanitizeResources: false,
});

Mock not working

Ensure mocks are passed to constructors:

const mockFs = new MockFileSystem();
const instance = new MyClass(mockFs); // Pass mock

Resources

End-to-End Integration Testing

Comprehensive visual testing dashboard for the Adblock Compiler API with real-time event reporting and WebSocket testing.

🎯 Overview

The E2E testing dashboard (/e2e-tests.html) provides:

  • 15+ Integration Tests covering all API endpoints
  • Real-time Visual Feedback with color-coded status
  • WebSocket Testing with live message display
  • Event Log tracking all test activities
  • Performance Metrics (response times, throughput)
  • Interactive Controls (run all, stop, configure URL)

🚀 Quick Start

Access the Dashboard

# Start the server
deno task dev

# Open the test dashboard
open http://localhost:8787/e2e-tests.html

# Or in production
open https://adblock-compiler.jayson-knight.workers.dev/e2e-tests.html

Run Tests

  1. Configure API URL (defaults to http://localhost:8787)
  2. Click "Run All Tests" to execute the full suite
  3. Watch real-time progress in the test cards
  4. Review event log for detailed information
  5. Test WebSocket separately with dedicated controls

📋 Test Coverage

Core API Tests (6 tests)

TestEndpointValidates
API InfoGET /apiVersion info, endpoints list
MetricsGET /metricsPerformance metrics structure
Simple CompilePOST /compileBasic compilation flow
TransformationsPOST /compileMultiple transformations
Cache TestPOST /compileCache headers (X-Cache)
Batch CompilePOST /compile/batchParallel compilation

Streaming Tests (2 tests)

TestEndpointValidates
SSE StreamPOST /compile/streamServer-Sent Events delivery
Event TypesPOST /compile/streamEvent format validation

Queue Tests (4 tests)

TestEndpointValidates
Queue StatsGET /queue/statsQueue metrics
Async CompilePOST /compile/asyncJob queuing (202 or 500)
Batch AsyncPOST /compile/batch/asyncBatch job queuing
Queue ResultsGET /queue/results/{id}Result retrieval

Note: Queue tests accept both 202 (queued) and 500 (not configured) responses since queues may not be available locally.

Performance Tests (3 tests)

TestValidates
Response Time< 2 seconds for API endpoint
Concurrent Requests5 parallel requests succeed
Large Batch10-item batch compilation

🔌 WebSocket Testing

The dashboard includes dedicated WebSocket testing with visual feedback:

Features

  • Connection Status - Visual indicator (connected/disconnected/error)
  • Real-time Messages - All WebSocket messages displayed
  • Progress Bar - Visual compilation progress
  • Event Tracking - Logs all connection/message events

WebSocket Test Flow

1. Click "Connect WebSocket"
   → Establishes WS connection to /ws/compile

2. Click "Run WebSocket Test"
   → Sends compile request with sessionId
   → Receives real-time events:
     - welcome
     - compile:started
     - event (progress updates)
     - compile:complete

3. Click "Disconnect" when done

WebSocket Events

The test validates:

  • ✅ Connection establishment
  • ✅ Welcome message reception
  • ✅ Compile request acceptance
  • ✅ Event streaming (source, transformation, progress)
  • ✅ Completion notification
  • ✅ Error handling

📊 Visual Features

Test Status Colors

🔵 Pending  - Gray (waiting to run)
🟠 Running  - Orange (currently executing, animated pulse)
🟢 Passed   - Green (successful)
🔴 Failed   - Red (error occurred)

Real-time Statistics

Dashboard displays:

  • Total Tests - Number of tests in suite
  • Passed - Successfully completed tests (green)
  • Failed - Tests with errors (red)
  • Duration - Total execution time

Event Log

Color-coded terminal-style log showing:

  • 🔵 Info (Blue) - Test starts, general information
  • 🟢 Success (Green) - Test passes
  • 🔴 Error (Red) - Test failures with error messages
  • 🟠 Warning (Orange) - Non-critical issues

🧪 Test Implementation Details

Test Structure

Each test includes:

{
    id: 'test-id',              // Unique identifier
    name: 'Display Name',       // User-friendly name
    category: 'core',           // Test category
    status: 'pending',          // Current status
    duration: 0,                // Execution time (ms)
    error: null                 // Error message if failed
}

Example Test

async function testCompileSimple(baseUrl) {
    const body = {
        configuration: {
            name: 'E2E Test',
            sources: [{ source: 'test' }],
        },
        preFetchedContent: {
            test: '||example.com^'
        }
    };
    
    const response = await fetch(`${baseUrl}/compile`, {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify(body),
    });
    
    if (!response.ok) throw new Error(`HTTP ${response.status}`);
    const data = await response.json();
    if (!data.success || !data.rules) throw new Error('Invalid response');
}

Adding Custom Tests

  1. Add test definition to initializeTests():
{ 
    id: 'my-test', 
    name: 'My Custom Test', 
    category: 'core', 
    status: 'pending', 
    duration: 0 
}
  1. Implement test function:
async function testMyCustomTest(baseUrl) {
    // Your test logic here
    const response = await fetch(`${baseUrl}/my-endpoint`);
    if (!response.ok) throw new Error(`Failed: ${response.status}`);
}
  1. Add case to runTest() switch statement:
case 'my-test':
    await testMyCustomTest(baseUrl);
    break;

🎨 UI Components

Test Cards

Each category has a dedicated card:

  • Core API - Core endpoints (6 tests)
  • Streaming - SSE/WebSocket (2 tests)
  • Queue - Async operations (4 tests)
  • Performance - Speed/throughput (3 tests)

Controls

  • API Base URL - Configurable (local/production)
  • Run All Tests - Execute full suite sequentially
  • Stop - Abort running tests
  • WebSocket Controls - Connect, test, disconnect

📈 Performance Validation

Response Time Test

Validates API response time < 2 seconds:

const start = Date.now();
const response = await fetch(`${baseUrl}/api`);
const duration = Date.now() - start;

if (duration > 2000) throw new Error(`Too slow: ${duration}ms`);

Concurrent Requests Test

Verifies 5 parallel requests succeed:

const promises = Array(5).fill(null).map(() => 
    fetch(`${baseUrl}/api`)
);

const responses = await Promise.all(promises);
const failures = responses.filter(r => !r.ok);

if (failures.length > 0) {
    throw new Error(`${failures.length}/5 failed`);
}

Large Batch Test

Tests 10-item batch compilation:

const requests = Array(10).fill(null).map((_, i) => ({
    id: `item-${i}`,
    configuration: { name: `Test ${i}`, sources: [...] },
    preFetchedContent: { ... }
}));

const response = await fetch(`${baseUrl}/compile/batch`, {
    method: 'POST',
    body: JSON.stringify({ requests }),
});

🔍 Debugging

View Test Details

Event log shows:

  • Test start times
  • Response times
  • Error messages
  • Cache hit/miss status
  • Queue availability

Common Issues

All tests fail immediately:

❌ Check server is running at configured URL
curl http://localhost:8787/api

Queue tests return 500:

⚠️ Expected - queues not configured locally
Deploy to Cloudflare Workers to test queue functionality

WebSocket won't connect:

❌ Check WebSocket endpoint is available
Ensure /ws/compile route is implemented

SSE tests timeout:

⚠️ Server may be slow or not streaming events
Check compile/stream endpoint implementation

🚀 CI/CD Integration

GitHub Actions Example

name: E2E Tests

on: [push, pull_request]

jobs:
  e2e-tests:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      
      - uses: denoland/setup-deno@v1
      
      - name: Start server
        run: deno task dev &
        
      - name: Wait for server
        run: sleep 5
      
      - name: Install Playwright
        run: npm install -g playwright
      
      - name: Run E2E tests
        run: |
          playwright test --headed \
            --base-url http://localhost:8787 \
            e2e-tests.html

Automated Testing

Use Playwright or Puppeteer to automate:

// example-playwright-test.js
const { test, expect } = require('@playwright/test');

test('E2E test suite passes', async ({ page }) => {
    await page.goto('http://localhost:8787/e2e-tests.html');
    
    // Click run all tests
    await page.click('#runAllBtn');
    
    // Wait for completion
    await page.waitForSelector('#runAllBtn:not([disabled])', {
        timeout: 60000
    });
    
    // Check stats
    const passed = await page.textContent('#passedTests');
    const failed = await page.textContent('#failedTests');
    
    expect(parseInt(failed)).toBe(0);
    expect(parseInt(passed)).toBeGreaterThan(0);
});

🛠️ Configuration

Environment-specific URLs

// Development
document.getElementById('apiUrl').value = 'http://localhost:8787';

// Staging
document.getElementById('apiUrl').value = 'https://staging.example.com';

// Production
document.getElementById('apiUrl').value = 'https://adblock-compiler.jayson-knight.workers.dev';

Custom Test Timeout

Modify SSE test timeout:

const timeout = setTimeout(() => {
    reader.cancel();
    resolve(); // or reject()
}, 5000); // 5 seconds instead of default 3

💡 Best Practices

  1. Run tests before committing

    # Open dashboard and run tests
    open http://localhost:8787/e2e-tests.html
    
  2. Test against local server first

    • Faster feedback
    • Doesn't consume production quotas
    • Easier debugging
  3. Use WebSocket test for real-time validation

    • Verifies bidirectional communication
    • Tests event streaming
    • Validates session management
  4. Monitor event log for issues

    • Cache behavior
    • Response times
    • Queue availability
    • Error messages
  5. Update tests when adding endpoints

    • Add test definition
    • Implement test function
    • Add to switch statement
    • Update category count

🎯 Summary

The E2E testing dashboard provides:

Comprehensive Coverage - All API endpoints tested
Visual Feedback - Real-time status and progress
WebSocket Testing - Dedicated real-time testing
Event Tracking - Complete audit log
Performance Validation - Response time and throughput
Easy to Extend - Simple test addition process

Access it at: http://localhost:8787/e2e-tests.html 🚀

Postman API Testing Guide

This guide explains how to use the Postman collection to test the Adblock Compiler OpenAPI endpoints.

Quick Start

1. Import the Collection

  1. Open Postman
  2. Click Import in the top left
  3. Select File and choose docs/postman/postman-collection.json
  4. The collection will appear in your workspace

2. Import the Environment

  1. Click Import again
  2. Select File and choose docs/postman/postman-environment.json
  3. Select the "Adblock Compiler - Local" environment from the dropdown in the top right

3. Start the Server

# Start local development server
deno task dev

# Or using Docker
docker compose up -d

The server will be available at http://localhost:8787

4. Run Tests

You can run tests individually or as a collection:

  • Individual Request: Click any request and press Send
  • Folder: Right-click a folder and select Run folder
  • Entire Collection: Click the Run button next to the collection name

Collection Structure

The collection is organized into the following folders:

📊 Metrics

  • Get API Info - Retrieves API version and available endpoints
  • Get Performance Metrics - Fetches aggregated performance data

⚙️ Compilation

  • Compile Simple Filter List - Basic compilation with pre-fetched content
  • Compile with Transformations - Tests multiple transformations (RemoveComments, Validate, Deduplicate)
  • Compile with Cache Check - Verifies caching behavior (X-Cache header)
  • Compile Invalid Configuration - Error handling test

📡 Streaming

  • Compile with SSE Stream - Server-Sent Events streaming test

📦 Batch Processing

  • Batch Compile Multiple Lists - Compile 2 lists in parallel
  • Batch Compile - Max Limit Test - Test the 10-item batch limit

🔄 Queue

  • Queue Async Compilation - Queue a job for async processing
  • Queue Batch Async Compilation - Queue multiple jobs
  • Get Queue Stats - Retrieve queue metrics
  • Get Queue Results - Fetch results using requestId

🔍 Edge Cases

  • Empty Configuration - Test with empty request body
  • Missing Required Fields - Test validation
  • Large Batch Request (>10) - Test batch size limit enforcement

Test Assertions

Each request includes automated tests that verify:

Response Validation

pm.test('Status code is 200', function () {
    pm.response.to.have.status(200);
});

Schema Validation

pm.test('Response is successful', function () {
    const jsonData = pm.response.json();
    pm.expect(jsonData.success).to.be.true;
    pm.expect(jsonData).to.have.property('rules');
});

Business Logic

pm.test('Rules are deduplicated', function () {
    const jsonData = pm.response.json();
    const uniqueRules = new Set(jsonData.rules.filter(r => !r.startsWith('!')));
    pm.expect(uniqueRules.size).to.equal(jsonData.rules.filter(r => !r.startsWith('!')).length);
});

Header Validation

pm.test('Check cache headers', function () {
    pm.expect(pm.response.headers.get('X-Cache')).to.be.oneOf(['HIT', 'MISS']);
});

Variables

The collection uses the following variables:

  • baseUrl - Local development server URL (default: http://localhost:8787)
  • prodUrl - Production server URL
  • requestId - Auto-populated from async compilation responses

Switching Between Environments

To test against production:

  1. Change the baseUrl variable to {{prodUrl}}
  2. Or create a new environment for production

Running Collection with Newman (CLI)

You can run the collection from the command line using Newman:

# Install Newman
npm install -g newman

# Run the collection against local server
newman run docs/postman/postman-collection.json -e docs/postman/postman-environment.json

# Run with detailed output
newman run docs/postman/postman-collection.json -e docs/postman/postman-environment.json --reporters cli,json

# Run specific folder
newman run docs/postman/postman-collection.json -e docs/postman/postman-environment.json --folder "Compilation"

CI/CD Integration

GitHub Actions Example

name: API Tests
on: [push, pull_request]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      
      - name: Start server
        run: docker compose up -d
      
      - name: Wait for server
        run: sleep 5
      
      - name: Install Newman
        run: npm install -g newman
      
      - name: Run Postman tests
        run: newman run docs/postman/postman-collection.json -e docs/postman/postman-environment.json
      
      - name: Stop server
        run: docker compose down

Advanced Testing

Pre-request Scripts

You can add pre-request scripts to generate dynamic data:

// Generate random filter rules
const rules = Array.from({length: 10}, (_, i) => `||example${i}.com^`);
pm.collectionVariables.set('dynamicRules', rules.join('\\n'));

Test Sequences

Run requests in sequence to test workflows:

  1. Queue Async Compilation → captures requestId
  2. Get Queue Stats → verify job is pending
  3. Get Queue Results → retrieve compiled results

Performance Testing

Use the Collection Runner with multiple iterations:

  1. Click Run on the collection
  2. Set Iterations to desired number (e.g., 100)
  3. Set Delay between requests (e.g., 100ms)
  4. View performance metrics in the run summary

Troubleshooting

Server Not Responding

# Check if server is running
curl http://localhost:8787/api

# Check Docker logs
docker compose logs -f

# Restart server
docker compose restart

Queue Tests Failing

Queue tests may return 500 if Cloudflare Queues aren't configured:

{
  "success": false,
  "error": "Queue bindings are not available..."
}

This is expected for local development without queue configuration.

Rate Limiting

If you hit rate limits (429 responses), wait for the rate limit window to reset or adjust RATE_LIMIT_MAX_REQUESTS in the server configuration.

Best Practices

  1. Run tests before commits - Ensure API compatibility
  2. Test against local first - Avoid production impact
  3. Use environments - Separate dev/staging/prod configurations
  4. Review test results - Don't ignore failed assertions
  5. Update tests - Keep tests in sync with OpenAPI spec changes

Support

For issues or questions:

CI/CD Workflows Documentation

Documentation for GitHub Actions CI/CD workflows, automation, and environment setup.

Contents

GitHub Actions Workflows

This document describes the GitHub Actions workflows used in this repository and explains the recent improvements made for better performance and maintainability.

Overview

The repository uses four main workflows:

  1. CI (ci.yml) - Continuous Integration for code quality and deployment
  2. Version Bump (version-bump.yml) - Automatic or manual version updates with changelog
  3. Create Version Tag (create-version-tag.yml) - Creates release tags for merged version bump PRs
  4. Release (release.yml) - Build and publish releases

CI Workflow

Trigger: Push to main, Pull Requests, Manual dispatch

Jobs

Parallel Quality Checks (runs concurrently)

  1. Lint - Code linting with Deno
  2. Format - Code formatting check with Deno
  3. Type Check - TypeScript type checking for all entry points
  4. Test - Run test suite with coverage; coverage artifact uploaded on both PRs and main push
  5. Security - Trivy vulnerability scanning
  6. Frontend Build - Angular frontend lint, test, build, and artifact upload (single merged job)
  7. Validate Cloudflare Schema - Runs deno task schema:cloudflare and verifies that docs/api/cloudflare-schema.yaml (Cloudflare API Shield schema generated from the OpenAPI spec) is up to date

PR-Only Parallel Job (needs frontend-build artifact)

  1. Verify Deploy - Cloudflare Worker build dry-run (deno task wrangler:verify); runs on PRs only, waits for the frontend-build artifact but otherwise runs in parallel with the quality checks above

Sequential Jobs (run after all checks pass)

  1. CI Gate - Python script verifying all upstream jobs passed or were acceptably skipped; blocks publish and deploy
  2. Publish - Publish to JSR (main only, after CI gate passes)
  3. Deploy - Deploy to Cloudflare (main only, when enabled, after CI gate passes)

Composite Actions

A reusable composite action handles Deno dependency installation with a 3-attempt retry loop and DENO_TLS_CA_STORE=system:

# Used in all jobs that require Deno deps
- uses: ./.github/actions/deno-install

The action is defined in .github/actions/deno-install/action.yml and is used by the typecheck, test, publish, verify-deploy, and deploy jobs.

Key Improvements

  • Parallelization: Lint, format, typecheck, test, and security scans run simultaneously
  • Proper Gating: ci-gate blocks publish/deploy until lint, format, typecheck, test, security, frontend-build, and verify-deploy all pass
  • Worker Build Verified on PRs: verify-deploy runs a Cloudflare Worker dry-run on every PR so Worker build failures are caught before merge
  • Composite Action: deno install retry logic extracted to .github/actions/deno-install — no duplication across jobs
  • Merged Frontend Jobs: frontend (lint+test) and frontend-build (build+artifact) are now a single frontend-build job — one pnpm install per run
  • Frozen Lockfile: pnpm install --frozen-lockfile enforced — CI fails if pnpm-lock.yaml drifts from package.json
  • Coverage on PRs: Test coverage artifact uploaded on pull requests, not just main push
  • SHA-Pinned Actions: All third-party actions pinned to full commit SHAs with version comments (supply-chain hardening)
  • Better Caching: Includes deno.lock in cache key for more precise invalidation
  • Comprehensive Type Checking: Checks all entry points (index.ts, cli.ts, worker.ts, tail.ts)
  • Consolidated Worker Deployment: Main and tail Cloudflare Workers deployed from a single CI deploy job (no separate Pages deployment)
  • Migration Error Handling: run_migration() shell function distinguishes real errors from "already applied" idempotency messages

Performance Gains

  • Before: ~5-7 minutes (sequential execution)
  • After: ~2-3 minutes (parallel execution)
  • Improvement: ~40-50% faster

Release Workflow

Trigger: Push tags (v*), Manual dispatch with version input

Jobs

  1. Validate - Run full CI suite before building anything
  2. Build Binaries - Build native binaries for all platforms (parallel matrix)
  3. Build Docker - Build and push multi-platform Docker images
  4. Create Release - Generate GitHub release with all artifacts

Key Improvements

  • Pre-build Validation: Ensures code quality before expensive build operations
  • Better Caching: Per-target caching for binary builds
  • Simplified Asset Prep: Uses find instead of complex loop
  • Cleaner Structure: Removed verbose comments, organized logically

Performance Gains

  • Before: ~15-20 minutes (no validation, potential failures late)
  • After: ~12-15 minutes (early validation prevents wasted builds)
  • Improvement: Faster failure detection, ~20% reduction in failed build time

Version Bump Workflow

Trigger: Push to main, Manual dispatch

Jobs

  1. Version Bump - Automatically analyze commits and bump version, or manually specify bump type
  2. Trigger Release - Optionally trigger release workflow (if requested via manual dispatch)

Key Features

  • Automatic Detection: Uses conventional commits to determine version bump type
  • Manual Override: Can manually specify patch/minor/major bump
  • Changelog Generation: Automatically generates changelog entries from commits
  • PR-Based: Creates pull request with version changes for review
  • Skip Logic: Skips if [skip ci] or [skip version] in commit message

Conventional Commits Support

  • feat: → minor bump
  • fix: → patch bump
  • perf: → patch bump
  • feat!: or BREAKING CHANGE: → major bump

Changes from Previous Version

  • Consolidated: Merged auto-version-bump.yml and version-bump.yml into single workflow
  • Simplified: Single workflow handles both automatic and manual triggers
  • Improved: Better error handling and verification steps

Create Version Tag Workflow

Trigger: PR closed (for version bump PRs only)

Jobs

  1. Create Tag - Creates release tag when version bump PR is merged

Key Features

  • Automatic Tagging: Creates v<version> tag when version bump PR is merged
  • Idempotent: Checks if tag exists before creating
  • Cleanup: Deletes version bump branch after tagging
  • Release Trigger: Tag automatically triggers release workflow

Caching Strategy

All workflows now use an improved caching strategy:

key: deno-${{ runner.os }}-${{ hashFiles('deno.json', 'deno.lock') }}
restore-keys: |
    deno-${{ runner.os }}-

This ensures:

  • Cache is invalidated when dependencies change
  • Fallback to OS-specific cache if exact match not found
  • Faster dependency installation

Environment Variables

Common

  • DENO_VERSION: '2.x' - Deno version used across all workflows

CI Workflow

  • CODECOV_TOKEN - For uploading test coverage (optional)
  • CLOUDFLARE_API_TOKEN - For Cloudflare deployments (optional)
  • CLOUDFLARE_ACCOUNT_ID - For Cloudflare deployments (optional)

Required Variables

  • ENABLE_CLOUDFLARE_DEPLOY - Repository variable to enable/disable Cloudflare deployments

Permissions

All workflows use minimal permissions following the principle of least privilege:

CI

  • contents: read - For checking out code
  • id-token: write - For JSR publishing (publish job only)
  • security-events: write - For uploading security scan results (security job only)

Release

  • contents: write - For creating releases and tags
  • packages: write - For publishing Docker images

Version Bump

  • contents: write - For committing version changes
  • actions: write - For triggering release workflow

Concurrency

All workflows use concurrency groups to prevent multiple runs on the same ref:

concurrency:
    group: ${{ github.workflow }}-${{ github.ref }}
    cancel-in-progress: true

This ensures:

  • Only one workflow runs per branch/PR at a time
  • Outdated runs are automatically cancelled when new commits are pushed
  • Saves CI minutes and prevents race conditions

Best Practices

When to Use Each Workflow

  1. CI: Automatically runs on every push/PR - no manual intervention needed
  2. Version Bump: Run manually when you want to bump the version
  3. Release: Automatically triggered by version tags, or run manually for specific versions
  1. Make your changes on a feature branch
  2. Create a PR and wait for CI to pass
  3. Merge to main
  4. Version bump workflow automatically runs and creates a version bump PR
  5. Review and merge the version bump PR
  6. Create version tag workflow automatically creates the release tag
  7. Release workflow automatically builds and publishes the release

Or for manual version bump:

  1. Make your changes on a feature branch
  2. Create a PR and wait for CI to pass
  3. Merge to main
  4. Run "Version Bump" workflow manually with desired bump type
  5. Optionally check "Create a release after bumping" to skip the PR review step

Troubleshooting

Publish Fails with "Version Already Exists"

This is expected and not an error. The workflow treats this as success to allow re-running the workflow.

Deploy Jobs Don't Run

Check that ENABLE_CLOUDFLARE_DEPLOY repository variable is set to 'true' (as a string).

Binary Build Fails for ARM64 Linux

The ARM64 Linux build uses cross-compilation. If it fails, check Deno's compatibility with the target platform in the Deno release notes.

Migration Notes

If you're migrating from the old workflows:

Breaking Changes

  • Version bump no longer runs automatically on PR open
  • Example files are no longer automatically updated during version bump
  • Deploy jobs now combined into single job

Non-Breaking Changes

  • All existing secrets and variables work the same way
  • Workflow dispatch inputs are backwards compatible
  • Release process is unchanged

Future Improvements

Potential areas for further optimization:

  • Add workflow to automatically create PRs for dependency updates
  • Add scheduled security scanning (weekly)
  • Consider splitting test job by test type (unit vs integration)
  • Add benchmark tracking over time
  • Add automatic changelog generation
  • Add path-based filtering to skip frontend-build on backend-only PRs (currently blocked by verify-deploy's artifact dependency)

GitHub Actions Environment Setup

This project uses a layered environment configuration system that automatically loads variables based on the git branch.

How It Works

The .github/actions/setup-env composite action mimics the behavior of .envrc for GitHub Actions workflows:

  1. Detects the environment from the branch name
  2. Loads .env (base configuration)
  3. Loads .env.$ENV (environment-specific)
  4. Exports all variables to $GITHUB_ENV

Branch to Environment Mapping

Branch PatternEnvironmentLoaded Files
mainproduction.env, .env.production
dev, developdevelopment.env, .env.development
Other branches (with file)Custom.env, .env.$BRANCH_NAME
Other branches (no file)Default.env

Usage in Workflows

Basic Usage

steps:
  - uses: actions/checkout@v4
  
  - name: Load environment variables
    uses: ./.github/actions/setup-env
  
  - name: Use environment variables
    run: |
      echo "Compiler version: $COMPILER_VERSION"
      echo "Port: $PORT"

With Custom Branch

- name: Load environment variables for specific branch
  uses: ./.github/actions/setup-env
  with:
    branch: 'staging'

Access Detected Environment

- name: Load environment variables
  id: env
  uses: ./.github/actions/setup-env

- name: Use detected environment
  run: echo "Running in ${{ steps.env.outputs.environment }} environment"

Environment Variables Available

After loading, the following variables are available:

From .env (all environments)

  • COMPILER_VERSION - Current compiler version
  • PORT - Server port (default: 8787)
  • DENO_DIR - Deno cache directory

From .env.development (dev/develop branches)

  • DATABASE_URL - Local SQLite database path
  • TURNSTILE_SITE_KEY - Test Turnstile site key (always passes)
  • TURNSTILE_SECRET_KEY - Test Turnstile secret key

From .env.production (main branch)

  • DATABASE_URL - Production database URL (placeholder)
  • TURNSTILE_SITE_KEY - Production site key (placeholder)
  • TURNSTILE_SECRET_KEY - Production secret key (placeholder)

Note: Production secrets should be set using GitHub Secrets, not loaded from files.

Setting Production Secrets

For production deployments, set secrets in GitHub repository settings:

env:
  CLOUDFLARE_API_TOKEN: ${{ secrets.CLOUDFLARE_API_TOKEN }}
  ADMIN_KEY: ${{ secrets.ADMIN_KEY }}
  TURNSTILE_SECRET_KEY: ${{ secrets.TURNSTILE_SECRET_KEY }}

Required secrets for production:

  • CLOUDFLARE_API_TOKEN - Cloudflare API token
  • CLOUDFLARE_ACCOUNT_ID - Cloudflare account ID
  • ADMIN_KEY - Admin API key
  • TURNSTILE_SITE_KEY - Production Turnstile site key
  • TURNSTILE_SECRET_KEY - Production Turnstile secret key

Example: Deploy Workflow

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      
      - name: Load environment variables
        id: env
        uses: ./.github/actions/setup-env
      
      - name: Deploy to environment
        run: |
          if [ "${{ steps.env.outputs.environment }}" = "production" ]; then
            wrangler deploy  # production is the top-level default env; no --env flag needed
          else
            wrangler deploy --env development
          fi
        env:
          # Production secrets override file-based config
          CLOUDFLARE_API_TOKEN: ${{ secrets.CLOUDFLARE_API_TOKEN }}
          ADMIN_KEY: ${{ secrets.ADMIN_KEY }}

Comparison: Local vs CI

AspectLocal DevelopmentGitHub Actions
Loader.envrc + direnv.github/actions/setup-env
DetectionGit branch (real-time)github.ref_name
Secrets.env.local (not committed)GitHub Secrets
Override.env.local overrides allGitHub env vars override files

Debugging

To see what environment is detected and what variables are loaded:

- name: Load environment variables
  id: env
  uses: ./.github/actions/setup-env

- name: Debug environment
  run: |
    echo "Environment: ${{ steps.env.outputs.environment }}"
    echo "Branch: ${{ github.ref_name }}"
    env | grep -E 'COMPILER_VERSION|PORT|DATABASE_URL' || true

Security Best Practices

  1. DO use GitHub Secrets for production credentials
  2. DO load base config from .env files
  3. DO use test keys in .env.development
  4. DON'T commit real secrets to .env.* files
  5. DON'T echo secret values in workflow logs
  6. DON'T use production credentials in PR builds

Workflow Diagrams

This document contains comprehensive workflow diagrams for the adblock-compiler system, including Cloudflare Workflows, queue-based processing, compilation pipelines, and supporting processes.

Table of Contents


System Architecture Overview

High-level view of all processing systems and their interactions.

flowchart TB
    subgraph "Client Layer"
        WEB[Web UI]
        API_CLIENT[API Clients]
        CRON[Cron Scheduler]
    end

    subgraph "API Layer"
        direction TB
        SYNC[Synchronous Endpoints<br/>/compile, /compile/batch]
        ASYNC[Async Endpoints<br/>/compile/async, /compile/batch/async]
        WORKFLOW_API[Workflow Endpoints<br/>/workflow/*]
        STREAM[Streaming Endpoint<br/>/compile/stream]
    end

    subgraph "Processing Layer"
        direction TB

        subgraph "Cloudflare Workflows"
            CW[CompilationWorkflow]
            BCW[BatchCompilationWorkflow]
            CWW[CacheWarmingWorkflow]
            HMW[HealthMonitoringWorkflow]
        end

        subgraph "Cloudflare Queues"
            STD_Q[(Standard Queue)]
            HIGH_Q[(High Priority Queue)]
            DLQ[(Dead Letter Queue)]
        end

        CONSUMER[Queue Consumer]
    end

    subgraph "Compilation Engine"
        FC[FilterCompiler]
        SC[SourceCompiler]
        TP[TransformationPipeline]
        HG[HeaderGenerator]
    end

    subgraph "Storage Layer"
        KV_CACHE[(KV: COMPILATION_CACHE)]
        KV_METRICS[(KV: METRICS)]
        KV_RATE[(KV: RATE_LIMIT)]
        KV_EVENTS[(KV: Workflow Events)]
        D1[(D1: Analytics)]
    end

    subgraph "External Sources"
        EASYLIST[EasyList]
        ADGUARD[AdGuard]
        OTHER[Other Filter Sources]
    end

    %% Client connections
    WEB --> SYNC
    WEB --> STREAM
    API_CLIENT --> SYNC
    API_CLIENT --> ASYNC
    API_CLIENT --> WORKFLOW_API
    CRON --> CWW
    CRON --> HMW

    %% API to Processing
    SYNC --> FC
    ASYNC --> STD_Q
    ASYNC --> HIGH_Q
    WORKFLOW_API --> CW
    WORKFLOW_API --> BCW
    WORKFLOW_API --> CWW
    WORKFLOW_API --> HMW

    %% Queue processing
    STD_Q --> CONSUMER
    HIGH_Q --> CONSUMER
    CONSUMER --> FC
    CONSUMER -.-> DLQ

    %% Workflow processing
    CW --> FC
    BCW --> FC
    CWW --> FC
    HMW --> EASYLIST
    HMW --> ADGUARD
    HMW --> OTHER

    %% Compilation flow
    FC --> SC
    SC --> TP
    TP --> HG

    %% External sources
    SC --> EASYLIST
    SC --> ADGUARD
    SC --> OTHER

    %% Storage
    FC --> KV_CACHE
    CW --> KV_EVENTS
    BCW --> KV_EVENTS
    CONSUMER --> KV_METRICS
    CW --> KV_METRICS
    BCW --> KV_METRICS
    HMW --> D1

    style CW fill:#e1f5ff,stroke:#0288d1
    style BCW fill:#e1f5ff,stroke:#0288d1
    style CWW fill:#e1f5ff,stroke:#0288d1
    style HMW fill:#e1f5ff,stroke:#0288d1
    style STD_Q fill:#c8e6c9,stroke:#388e3c
    style HIGH_Q fill:#fff9c4,stroke:#fbc02d
    style DLQ fill:#ffcdd2,stroke:#d32f2f
    style KV_CACHE fill:#e1bee7,stroke:#7b1fa2

Processing Path Comparison

PathEntry PointPersistenceCrash RecoveryBest For
Synchronous/compileNoneN/AInteractive requests
Queue-Based/compile/asyncQueueMessage retryBatch operations
Workflows/workflow/*Per-stepResume from checkpointLong-running, critical
Streaming/compile/streamNoneN/AReal-time progress

Cloudflare Workflows

Cloudflare Workflows provide durable execution with automatic state persistence, crash recovery, and observable progress.

Workflow System Architecture

flowchart TB
    subgraph "Workflow Triggers"
        API_TRIGGER[API Request<br/>POST /workflow/*]
        CRON_TRIGGER[Cron Schedule<br/>0 */6 * * *]
        MANUAL[Manual Trigger]
    end

    subgraph "Workflow Engine"
        WF_RUNTIME[Cloudflare<br/>Workflow Runtime]

        subgraph "State Management"
            CHECKPOINT[Step Checkpoints]
            STATE_PERSIST[State Persistence]
            CRASH_DETECT[Crash Detection]
        end
    end

    subgraph "Available Workflows"
        direction LR
        COMP_WF[CompilationWorkflow<br/>Single compilation]
        BATCH_WF[BatchCompilationWorkflow<br/>Multiple compilations]
        CACHE_WF[CacheWarmingWorkflow<br/>Pre-populate cache]
        HEALTH_WF[HealthMonitoringWorkflow<br/>Source availability]
    end

    subgraph "Event System"
        EVENT_EMIT[Event Emitter]
        KV_EVENTS[(KV: workflow:events:*)]
        EVENT_API[GET /workflow/events/:id]
    end

    subgraph "Metrics & Analytics"
        AE[Analytics Engine]
        KV_METRICS[(KV: workflow:metrics)]
        METRICS_API[GET /workflow/metrics]
    end

    API_TRIGGER --> WF_RUNTIME
    CRON_TRIGGER --> WF_RUNTIME
    MANUAL --> WF_RUNTIME

    WF_RUNTIME --> COMP_WF
    WF_RUNTIME --> BATCH_WF
    WF_RUNTIME --> CACHE_WF
    WF_RUNTIME --> HEALTH_WF

    WF_RUNTIME --> CHECKPOINT
    CHECKPOINT --> STATE_PERSIST
    CRASH_DETECT --> CHECKPOINT

    COMP_WF --> EVENT_EMIT
    BATCH_WF --> EVENT_EMIT
    CACHE_WF --> EVENT_EMIT
    HEALTH_WF --> EVENT_EMIT

    EVENT_EMIT --> KV_EVENTS
    KV_EVENTS --> EVENT_API

    COMP_WF --> AE
    BATCH_WF --> AE
    CACHE_WF --> AE
    HEALTH_WF --> AE
    AE --> KV_METRICS
    KV_METRICS --> METRICS_API

    style COMP_WF fill:#e3f2fd,stroke:#1976d2
    style BATCH_WF fill:#e8f5e9,stroke:#388e3c
    style CACHE_WF fill:#fff8e1,stroke:#f57c00
    style HEALTH_WF fill:#fce4ec,stroke:#c2185b

CompilationWorkflow

Handles single asynchronous compilation requests with durable state between steps.

flowchart TD
    subgraph "Step 1: validate"
        START([Workflow Start]) --> V_START[Start Validation]
        V_START --> V_EMIT1[Emit: workflow:started]
        V_EMIT1 --> V_CHECK{Configuration Valid?}
        V_CHECK -->|Yes| V_EMIT2[Emit: workflow:step:completed<br/>Progress: 10%]
        V_CHECK -->|No| V_ERROR[Emit: workflow:failed]
        V_ERROR --> RETURN_ERROR[Return Error Result]
    end

    subgraph "Step 2: compile-sources"
        V_EMIT2 --> C_START[Start Compilation]
        C_START --> C_EMIT1[Emit: workflow:step:started<br/>step: compile-sources]

        C_EMIT1 --> C_FETCH[Fetch Sources in Parallel]
        C_FETCH --> S1[Source 1]
        C_FETCH --> S2[Source 2]
        C_FETCH --> SN[Source N]

        S1 --> S1_EMIT[Emit: source:fetch:completed]
        S2 --> S2_EMIT[Emit: source:fetch:completed]
        SN --> SN_EMIT[Emit: source:fetch:completed]

        S1_EMIT --> C_COMBINE
        S2_EMIT --> C_COMBINE
        SN_EMIT --> C_COMBINE[Combine Rules]

        C_COMBINE --> C_TRANSFORM[Apply Transformations]
        C_TRANSFORM --> T_LOOP{For Each Transformation}
        T_LOOP --> T_APPLY[Apply Transformation]
        T_APPLY --> T_EMIT[Emit: transformation:completed]
        T_EMIT --> T_LOOP
        T_LOOP -->|Done| C_HEADER[Generate Header]

        C_HEADER --> C_EMIT2[Emit: workflow:step:completed<br/>Progress: 70%]
    end

    subgraph "Step 3: cache-result"
        C_EMIT2 --> CACHE_START[Start Caching]
        CACHE_START --> CACHE_COMPRESS[Gzip Compress Result]
        CACHE_COMPRESS --> CACHE_STORE[Store in KV<br/>TTL: 24 hours]
        CACHE_STORE --> CACHE_EMIT[Emit: cache:stored<br/>Progress: 90%]
    end

    subgraph "Step 4: update-metrics"
        CACHE_EMIT --> M_START[Update Metrics]
        M_START --> M_TRACK[Track in Analytics Engine]
        M_TRACK --> M_STORE[Store Metrics in KV]
        M_STORE --> M_EMIT[Emit: workflow:completed<br/>Progress: 100%]
    end

    M_EMIT --> RETURN_SUCCESS[Return Success Result]
    RETURN_ERROR --> END([Workflow End])
    RETURN_SUCCESS --> END

    style V_START fill:#e3f2fd
    style C_START fill:#fff8e1
    style CACHE_START fill:#e8f5e9
    style M_START fill:#f3e5f5
    style RETURN_SUCCESS fill:#c8e6c9
    style RETURN_ERROR fill:#ffcdd2

Retry Configuration:

StepRetriesDelayBackoffTimeout
validate11slinear30s
compile-sources330sexponential5m
cache-result22slinear30s
update-metrics11slinear10s

BatchCompilationWorkflow

Processes multiple compilations with per-chunk durability and crash recovery.

flowchart TD
    subgraph "Initialization"
        START([Batch Workflow Start]) --> INIT[Extract Batch Parameters]
        INIT --> EMIT_START[Emit: workflow:started<br/>batchSize, requestCount]
    end

    subgraph "Step 1: validate-batch"
        EMIT_START --> VAL_START[Validate All Configurations]
        VAL_START --> VAL_LOOP{For Each Request}
        VAL_LOOP --> VAL_CHECK{Config Valid?}
        VAL_CHECK -->|Yes| VAL_NEXT[Add to Valid List]
        VAL_CHECK -->|No| VAL_REJECT[Add to Rejected List]
        VAL_NEXT --> VAL_LOOP
        VAL_REJECT --> VAL_LOOP
        VAL_LOOP -->|Done| VAL_RESULT{Any Valid?}
        VAL_RESULT -->|No| BATCH_ERROR[Return: All Failed]
        VAL_RESULT -->|Yes| VAL_EMIT[Emit: workflow:step:completed<br/>validCount, rejectedCount]
    end

    subgraph "Step 2-N: compile-chunk-N"
        VAL_EMIT --> CHUNK_INIT[Split into Chunks<br/>MAX_CONCURRENT = 3]

        CHUNK_INIT --> CHUNK1[Chunk 1]

        subgraph "Chunk Processing"
            CHUNK1 --> C1_START[Step: compile-chunk-1]
            C1_START --> C1_EMIT[Emit: workflow:step:started]

            C1_EMIT --> C1_P1[Compile Item 1]
            C1_EMIT --> C1_P2[Compile Item 2]
            C1_EMIT --> C1_P3[Compile Item 3]

            C1_P1 --> C1_R1{Result}
            C1_P2 --> C1_R2{Result}
            C1_P3 --> C1_R3{Result}

            C1_R1 -->|Success| C1_S1[Cache Result 1]
            C1_R1 -->|Failure| C1_F1[Record Error 1]
            C1_R2 -->|Success| C1_S2[Cache Result 2]
            C1_R2 -->|Failure| C1_F2[Record Error 2]
            C1_R3 -->|Success| C1_S3[Cache Result 3]
            C1_R3 -->|Failure| C1_F3[Record Error 3]

            C1_S1 --> C1_SETTLE
            C1_F1 --> C1_SETTLE
            C1_S2 --> C1_SETTLE
            C1_F2 --> C1_SETTLE
            C1_S3 --> C1_SETTLE
            C1_F3 --> C1_SETTLE[Promise.allSettled]
        end

        C1_SETTLE --> C1_DONE[Emit: workflow:step:completed<br/>chunkSuccess, chunkFailed]
        C1_DONE --> CHUNK2{More Chunks?}
        CHUNK2 -->|Yes| NEXT_CHUNK[Process Next Chunk]
        NEXT_CHUNK --> C1_START
        CHUNK2 -->|No| METRICS_STEP
    end

    subgraph "Final Step: update-batch-metrics"
        METRICS_STEP[Step: update-batch-metrics] --> AGG[Aggregate Results]
        AGG --> TRACK[Track in Analytics]
        TRACK --> FINAL_EMIT[Emit: workflow:completed]
    end

    FINAL_EMIT --> RETURN[Return Batch Result]
    BATCH_ERROR --> END([Workflow End])
    RETURN --> END

    style CHUNK1 fill:#e3f2fd
    style C1_P1 fill:#fff8e1
    style C1_P2 fill:#fff8e1
    style C1_P3 fill:#fff8e1
    style C1_S1 fill:#c8e6c9
    style C1_S2 fill:#c8e6c9
    style C1_S3 fill:#c8e6c9
    style C1_F1 fill:#ffcdd2
    style C1_F2 fill:#ffcdd2
    style C1_F3 fill:#ffcdd2

Crash Recovery Scenario:

sequenceDiagram
    participant WF as BatchWorkflow
    participant CF as Cloudflare Runtime
    participant KV as State Storage

    Note over WF,KV: Normal Execution
    WF->>CF: Start chunk-1
    CF->>KV: Checkpoint: chunk-1 started
    WF->>WF: Process items 1-3
    CF->>KV: Checkpoint: chunk-1 complete

    WF->>CF: Start chunk-2
    CF->>KV: Checkpoint: chunk-2 started

    Note over WF,KV: Crash During chunk-2!
    WF--xWF: Worker crash/timeout

    Note over WF,KV: Automatic Recovery
    CF->>KV: Detect incomplete workflow
    CF->>KV: Load last checkpoint
    KV-->>CF: chunk-2 started (items 4-6)
    CF->>WF: Resume from chunk-2

    WF->>WF: Re-process items 4-6
    CF->>KV: Checkpoint: chunk-2 complete
    WF->>CF: Complete workflow

CacheWarmingWorkflow

Pre-compiles and caches popular filter lists to reduce latency for end users.

flowchart TD
    subgraph "Trigger Sources"
        CRON[Cron: 0 */6 * * *<br/>Every 6 hours]
        MANUAL[Manual: POST /workflow/cache-warm]
    end

    subgraph "Initialization"
        CRON --> START
        MANUAL --> START([CacheWarmingWorkflow])
        START --> PARAMS{Custom Configs<br/>Provided?}
        PARAMS -->|Yes| USE_CUSTOM[Use Custom Configurations]
        PARAMS -->|No| USE_DEFAULT[Use Default Popular Lists]
    end

    subgraph "Default Configurations"
        USE_DEFAULT --> DEFAULT[Default Popular Lists]
        DEFAULT --> D1[EasyList<br/>https://easylist.to/.../easylist.txt]
        DEFAULT --> D2[EasyPrivacy<br/>https://easylist.to/.../easyprivacy.txt]
        DEFAULT --> D3[AdGuard Base<br/>https://filters.adtidy.org/.../filter.txt]
    end

    subgraph "Step 1: check-cache-status"
        USE_CUSTOM --> CHECK
        D1 --> CHECK
        D2 --> CHECK
        D3 --> CHECK
        CHECK[Check Existing Cache Status] --> CHECK_LOOP{For Each Config}
        CHECK_LOOP --> CACHE_CHECK{Cache Fresh?}
        CACHE_CHECK -->|Yes| SKIP[Skip - Already Cached]
        CACHE_CHECK -->|No/Expired| QUEUE[Add to Warming Queue]
        SKIP --> CHECK_LOOP
        QUEUE --> CHECK_LOOP
        CHECK_LOOP -->|Done| CHECK_EMIT[Emit: step:completed<br/>toWarm: N, skipped: M]
    end

    subgraph "Step 2-N: warm-chunk-N"
        CHECK_EMIT --> CHUNK_SPLIT[Split into Chunks<br/>MAX_CONCURRENT = 2]

        CHUNK_SPLIT --> CHUNK1[Chunk 1]
        CHUNK1 --> WARM1[Step: warm-chunk-1]

        WARM1 --> W1_C1[Compile Config 1]
        W1_C1 --> W1_WAIT1[Wait 2s<br/>Be Nice to Upstream]
        W1_WAIT1 --> W1_C2[Compile Config 2]
        W1_C2 --> W1_CACHE[Cache Both Results]
        W1_CACHE --> W1_EMIT[Emit: step:completed]

        W1_EMIT --> CHUNK_WAIT[Wait 10s<br/>Inter-chunk Delay]
        CHUNK_WAIT --> MORE_CHUNKS{More Chunks?}
        MORE_CHUNKS -->|Yes| NEXT_CHUNK[Process Next Chunk]
        NEXT_CHUNK --> WARM1
        MORE_CHUNKS -->|No| METRICS_STEP
    end

    subgraph "Step N+1: update-warming-metrics"
        METRICS_STEP[Update Warming Metrics] --> TRACK[Track Statistics]
        TRACK --> STORE[Store in KV/Analytics]
        STORE --> FINAL_EMIT[Emit: workflow:completed]
    end

    FINAL_EMIT --> RESULT[Return Warming Result]
    RESULT --> END([End])

    style CRON fill:#fff9c4,stroke:#f57c00
    style DEFAULT fill:#e8f5e9
    style CHUNK1 fill:#e3f2fd
    style W1_WAIT1 fill:#f5f5f5
    style CHUNK_WAIT fill:#f5f5f5

Warming Schedule:

gantt
    title Cache Warming Schedule (24-hour cycle)
    dateFormat HH:mm
    axisFormat %H:%M

    section Cron Triggers
    Cache Warm Run 1    :cron1, 00:00, 30m
    Cache Warm Run 2    :cron2, 06:00, 30m
    Cache Warm Run 3    :cron3, 12:00, 30m
    Cache Warm Run 4    :cron4, 18:00, 30m

    section Cache Validity
    EasyList Cache      :active, cache1, 00:00, 24h
    EasyPrivacy Cache   :active, cache2, 00:00, 24h
    AdGuard Cache       :active, cache3, 00:00, 24h

HealthMonitoringWorkflow

Periodically checks availability and validity of upstream filter list sources.

flowchart TD
    subgraph "Trigger Sources"
        CRON[Cron: 0 * * * *<br/>Every hour]
        MANUAL[Manual: POST /workflow/health-check]
        ALERT_RECHECK[Alert-triggered Recheck]
    end

    subgraph "Initialization"
        CRON --> START
        MANUAL --> START
        ALERT_RECHECK --> START([HealthMonitoringWorkflow])
        START --> PARAMS{Custom Sources?}
        PARAMS -->|Yes| USE_CUSTOM[Use Provided Sources]
        PARAMS -->|No| USE_DEFAULT[Use Default Sources]
    end

    subgraph "Default Monitored Sources"
        USE_DEFAULT --> SOURCES[Default Sources]
        SOURCES --> S1[EasyList<br/>Expected: 50,000+ rules]
        SOURCES --> S2[EasyPrivacy<br/>Expected: 10,000+ rules]
        SOURCES --> S3[AdGuard Base<br/>Expected: 30,000+ rules]
        SOURCES --> S4[AdGuard Tracking<br/>Expected: 10,000+ rules]
        SOURCES --> S5[Peter Lowe's List<br/>Expected: 2,000+ rules]
    end

    subgraph "Step 1: load-health-history"
        USE_CUSTOM --> HISTORY
        S1 --> HISTORY
        S2 --> HISTORY
        S3 --> HISTORY
        S4 --> HISTORY
        S5 --> HISTORY
        HISTORY[Load Health History] --> HIST_FETCH[Fetch Last 30 Days]
        HIST_FETCH --> HIST_ANALYZE[Analyze Failure Patterns]
        HIST_ANALYZE --> HIST_EMIT[Emit: step:completed]
    end

    subgraph "Step 2-N: check-source-N"
        HIST_EMIT --> CHECK_LOOP[For Each Source]

        CHECK_LOOP --> CHECK_SRC[Step: check-source-N]
        CHECK_SRC --> EMIT_START[Emit: health:check:started]

        EMIT_START --> HTTP_REQ[HTTP HEAD/GET Request]
        HTTP_REQ --> MEASURE[Measure Response Time]

        MEASURE --> VALIDATE{Validate Response}

        VALIDATE --> V_STATUS{Status 200?}
        V_STATUS -->|No| MARK_UNHEALTHY[Mark Unhealthy<br/>Record Error]
        V_STATUS -->|Yes| V_TIME{Response < 30s?}
        V_TIME -->|No| MARK_SLOW[Mark Unhealthy<br/>Too Slow]
        V_TIME -->|Yes| V_RULES{Rules >= Expected?}
        V_RULES -->|No| MARK_LOW[Mark Unhealthy<br/>Low Rule Count]
        V_RULES -->|Yes| MARK_HEALTHY[Mark Healthy]

        MARK_UNHEALTHY --> RECORD
        MARK_SLOW --> RECORD
        MARK_LOW --> RECORD
        MARK_HEALTHY --> RECORD[Record Result]

        RECORD --> EMIT_DONE[Emit: health:check:completed]
        EMIT_DONE --> DELAY[Sleep 2s]
        DELAY --> MORE_SRC{More Sources?}
        MORE_SRC -->|Yes| CHECK_LOOP
        MORE_SRC -->|No| ANALYZE_STEP
    end

    subgraph "Step N+1: analyze-results"
        ANALYZE_STEP[Analyze All Results] --> CALC[Calculate Statistics]
        CALC --> CHECK_CONSEC{Consecutive<br/>Failures >= 3?}
        CHECK_CONSEC -->|Yes| NEED_ALERT[Flag for Alert]
        CHECK_CONSEC -->|No| NO_ALERT[No Alert Needed]
    end

    subgraph "Step N+2: send-alerts (conditional)"
        NEED_ALERT --> ALERT_CHECK{alertOnFailure?}
        ALERT_CHECK -->|Yes| SEND[Send Alert Notification]
        ALERT_CHECK -->|No| SKIP_ALERT[Skip Alert]
        NO_ALERT --> STORE_STEP
        SEND --> STORE_STEP
        SKIP_ALERT --> STORE_STEP
    end

    subgraph "Step N+3: store-results"
        STORE_STEP[Store Results] --> STORE_KV[Store in KV]
        STORE_KV --> STORE_AE[Track in Analytics]
        STORE_AE --> EMIT_COMPLETE[Emit: workflow:completed]
    end

    EMIT_COMPLETE --> RETURN[Return Health Report]
    RETURN --> END([End])

    style CRON fill:#fff9c4
    style MARK_HEALTHY fill:#c8e6c9
    style MARK_UNHEALTHY fill:#ffcdd2
    style MARK_SLOW fill:#ffcdd2
    style MARK_LOW fill:#ffcdd2
    style NEED_ALERT fill:#ffcdd2

Health Check Response Structure:

classDiagram
    class HealthCheckResult {
        +string runId
        +Date timestamp
        +SourceHealth[] results
        +HealthSummary summary
    }

    class SourceHealth {
        +string name
        +string url
        +boolean healthy
        +number statusCode
        +number responseTimeMs
        +number ruleCount
        +string? error
    }

    class HealthSummary {
        +number total
        +number healthy
        +number unhealthy
        +number avgResponseTimeMs
    }

    class HealthHistory {
        +Date[] timestamps
        +Map~string, boolean[]~ sourceResults
        +number consecutiveFailures
    }

    HealthCheckResult --> SourceHealth
    HealthCheckResult --> HealthSummary
    HealthCheckResult --> HealthHistory

Workflow Events & Progress Tracking

Real-time progress tracking for all workflows using the WorkflowEvents system.

flowchart LR
    subgraph "Workflow Execution"
        WF[Any Workflow] --> EMIT[Event Emitter]
    end

    subgraph "Event Types"
        EMIT --> E1[workflow:started]
        EMIT --> E2[workflow:step:started]
        EMIT --> E3[workflow:step:completed]
        EMIT --> E4[workflow:step:failed]
        EMIT --> E5[workflow:progress]
        EMIT --> E6[workflow:completed]
        EMIT --> E7[workflow:failed]
        EMIT --> E8[source:fetch:started]
        EMIT --> E9[source:fetch:completed]
        EMIT --> E10[transformation:started]
        EMIT --> E11[transformation:completed]
        EMIT --> E12[cache:stored]
        EMIT --> E13[health:check:started]
        EMIT --> E14[health:check:completed]
    end

    subgraph "Event Storage"
        E1 --> KV[(KV: workflow:events:ID)]
        E2 --> KV
        E3 --> KV
        E4 --> KV
        E5 --> KV
        E6 --> KV
        E7 --> KV
        E8 --> KV
        E9 --> KV
        E10 --> KV
        E11 --> KV
        E12 --> KV
        E13 --> KV
        E14 --> KV
    end

    subgraph "Event Retrieval"
        KV --> API[GET /workflow/events/:id]
        API --> CLIENT[Client Polling]
    end

    style E6 fill:#c8e6c9
    style E7 fill:#ffcdd2
    style E4 fill:#ffcdd2

Event Polling Sequence:

sequenceDiagram
    participant Client
    participant API as /workflow/events/:id
    participant KV as Event Storage

    Note over Client,KV: Client starts polling for progress

    Client->>API: GET /workflow/events/wf-123
    API->>KV: Get events for wf-123
    KV-->>API: Events 1-3
    API-->>Client: {progress: 25%, events: [...]}

    Note over Client: Wait 2 seconds

    Client->>API: GET /workflow/events/wf-123?since=timestamp
    API->>KV: Get events since timestamp
    KV-->>API: Events 4-6
    API-->>Client: {progress: 60%, events: [...]}

    Note over Client: Wait 2 seconds

    Client->>API: GET /workflow/events/wf-123?since=timestamp
    API->>KV: Get events since timestamp
    KV-->>API: Events 7-8 (includes completed)
    API-->>Client: {progress: 100%, isComplete: true, events: [...]}

    Note over Client: Stop polling

Event Storage Limits:

ParameterValueNotes
TTL1 hourEvents auto-expire
Max Events100 per workflowOldest truncated
Key Formatworkflow:events:{workflowId}
ConsistencyEventualAcceptable for progress

Queue System Workflows

Async Compilation Flow

Complete end-to-end flow for asynchronous compilation requests.

sequenceDiagram
    participant C as Client
    participant API as Worker API
    participant RL as Rate Limiter
    participant TS as Turnstile
    participant QP as Queue Producer
    participant Q as Cloudflare Queue
    participant QC as Queue Consumer
    participant Compiler as FilterCompiler
    participant KV as KV Cache
    participant Metrics as Metrics Store

    Note over C,Metrics: Async Compilation Request Flow

    C->>API: POST /compile/async
    API->>API: Extract IP & Config
    
    API->>RL: Check Rate Limit
    alt Rate Limit Exceeded
        RL-->>API: Denied
        API-->>C: 429 Too Many Requests
    else Rate Limit OK
        RL-->>API: Allowed
        
        API->>TS: Verify Turnstile Token
        alt Turnstile Failed
            TS-->>API: Invalid
            API-->>C: 403 Forbidden
        else Turnstile OK
            TS-->>API: Valid
            
            API->>API: Generate Request ID
            API->>API: Create Queue Message
            API->>QP: Route by Priority
            
            alt High Priority
                QP->>Q: Send to High Priority Queue
            else Standard Priority
                QP->>Q: Send to Standard Queue
            end
            
            API->>Metrics: Track Enqueued
            API-->>C: 202 Accepted (requestId, priority)
            
            Note over Q,QC: Asynchronous Processing

            Q->>Q: Batch Messages
            Q->>QC: Deliver Message Batch
            
            QC->>QC: Dispatch by Type
            QC->>Compiler: Execute Compilation
            Compiler->>Compiler: Validate Config
            Compiler->>Compiler: Fetch & Compile Sources
            Compiler->>Compiler: Apply Transformations
            Compiler-->>QC: Compiled Rules + Metrics
            
            QC->>QC: Compress Result (gzip)
            QC->>KV: Store Cached Result
            QC->>Metrics: Track Completion
            QC->>Q: ACK Message
            
            Note over C,KV: Result Retrieval (Later)
            
            C->>API: POST /compile (same config)
            API->>KV: Check Cache by Key
            KV-->>API: Cached Result
            API->>API: Decompress Result
            API-->>C: 200 OK (rules, cached: true)
        end
    end

Queue Message Processing

Internal queue consumer flow showing message type dispatch and processing.

flowchart TD
    Start[Queue Consumer: handleQueue] --> BatchReceived{Message Batch Received}
    BatchReceived --> InitStats[Initialize Stats: acked=0, retried=0, unknown=0]
    
    InitStats --> LogBatch[Log: Processing batch of N messages]
    LogBatch --> ProcessLoop[For Each Message in Batch]
    
    ProcessLoop --> ExtractBody[Extract message.body]
    ExtractBody --> LogMessage[Log: Processing message X/N]
    
    LogMessage --> TypeCheck{Switch on message.type}
    
    TypeCheck -->|compile| ProcessCompile[processCompileMessage]
    TypeCheck -->|batch-compile| ProcessBatch[processBatchCompileMessage]
    TypeCheck -->|cache-warm| ProcessWarm[processCacheWarmMessage]
    TypeCheck -->|unknown| LogUnknown[Log: Unknown message type]
    
    ProcessCompile --> TryCompile{Compilation Success?}
    ProcessBatch --> TryBatch{Batch Success?}
    ProcessWarm --> TryWarm{Cache Warm Success?}
    LogUnknown --> AckUnknown[ACK message - unknown++]
    
    TryCompile -->|Success| AckCompile[ACK message - acked++]
    TryCompile -->|Error| RetryCompile[RETRY message - retried++]
    
    TryBatch -->|Success| AckBatch[ACK message - acked++]
    TryBatch -->|Error| RetryBatch[RETRY message - retried++]
    
    TryWarm -->|Success| AckWarm[ACK message - acked++]
    TryWarm -->|Error| RetryWarm[RETRY message - retried++]
    
    AckCompile --> LogComplete[Log: Message completed + duration]
    AckBatch --> LogComplete
    AckWarm --> LogComplete
    AckUnknown --> LogComplete
    RetryCompile --> LogError[Log: Message failed, will retry]
    RetryBatch --> LogError
    RetryWarm --> LogError
    
    LogComplete --> MoreMessages{More Messages?}
    LogError --> MoreMessages
    
    MoreMessages -->|Yes| ProcessLoop
    MoreMessages -->|No| LogBatchStats[Log: Batch statistics]
    
    LogBatchStats --> End[End Queue Processing]
    
    style ProcessCompile fill:#e1f5ff
    style ProcessBatch fill:#e1f5ff
    style ProcessWarm fill:#e1f5ff
    style AckCompile fill:#c8e6c9
    style AckBatch fill:#c8e6c9
    style AckWarm fill:#c8e6c9
    style AckUnknown fill:#fff9c4
    style RetryCompile fill:#ffcdd2
    style RetryBatch fill:#ffcdd2
    style RetryWarm fill:#ffcdd2

Priority Queue Routing

Shows how messages are routed to different queues based on priority level.

flowchart LR
    Client[Client Request] --> API[API Endpoint]
    
    API --> Extract[Extract Priority Field]
    Extract --> DefaultCheck{Priority Specified?}
    
    DefaultCheck -->|No| SetDefault[Set priority = 'standard']
    DefaultCheck -->|Yes| Validate{Validate Priority}
    
    SetDefault --> Route
    Validate -->|Invalid| SetDefault
    Validate -->|Valid| Route[Route Message]
    
    Route --> PriorityCheck{priority === 'high'?}
    
    PriorityCheck -->|Yes| HighQueue[(High Priority Queue)]
    PriorityCheck -->|No| StandardQueue[(Standard Queue)]
    
    HighQueue --> HighConsumer[High Priority Consumer]
    StandardQueue --> StandardConsumer[Standard Consumer]
    
    HighConsumer --> HighConfig[Config: max_batch_size=5<br/>max_batch_timeout=2s]
    StandardConsumer --> StandardConfig[Config: max_batch_size=10<br/>max_batch_timeout=5s]
    
    HighConfig --> Process[Process Messages]
    StandardConfig --> Process
    
    Process --> Result[Compilation Complete]
    
    style HighQueue fill:#ff9800
    style StandardQueue fill:#4caf50
    style HighConsumer fill:#ffe0b2
    style StandardConsumer fill:#c8e6c9
    style Result fill:#e1f5ff

Batch Processing Flow

Detailed flow showing how batch compilations are processed with chunking.

flowchart TD
    Start[processBatchCompileMessage] --> LogStart[Log: Starting batch of N requests]
    
    LogStart --> InitChunk[Initialize Chunk Processing<br/>chunkSize = 3]
    InitChunk --> SplitChunks[Split requests into chunks]
    
    SplitChunks --> ChunkLoop{For Each Chunk}
    
    ChunkLoop --> LogChunk[Log: Processing chunk X/Y]
    LogChunk --> CreatePromises[Create Promise Array<br/>for Chunk Items]
    
    CreatePromises --> ParallelExec[Promise.allSettled<br/>Execute 3 in Parallel]
    
    ParallelExec --> ProcessItem1[Create CompileQueueMessage<br/>processCompileMessage - Item 1]
    ParallelExec --> ProcessItem2[Create CompileQueueMessage<br/>processCompileMessage - Item 2]
    ParallelExec --> ProcessItem3[Create CompileQueueMessage<br/>processCompileMessage - Item 3]
    
    ProcessItem1 --> Compile1[Compile + Cache]
    ProcessItem2 --> Compile2[Compile + Cache]
    ProcessItem3 --> Compile3[Compile + Cache]
    
    Compile1 --> Settle1{Status}
    Compile2 --> Settle2{Status}
    Compile3 --> Settle3{Status}
    
    Settle1 -->|fulfilled| Success1[successful++]
    Settle1 -->|rejected| Fail1[failed++<br/>Record Error]
    
    Settle2 -->|fulfilled| Success2[successful++]
    Settle2 -->|rejected| Fail2[failed++<br/>Record Error]
    
    Settle3 -->|fulfilled| Success3[successful++]
    Settle3 -->|rejected| Fail3[failed++<br/>Record Error]
    
    Success1 --> ChunkComplete
    Fail1 --> ChunkComplete
    Success2 --> ChunkComplete
    Fail2 --> ChunkComplete
    Success3 --> ChunkComplete
    Fail3 --> ChunkComplete
    
    ChunkComplete[Log: Chunk complete<br/>X/Y successful] --> MoreChunks{More Chunks?}
    
    MoreChunks -->|Yes| ChunkLoop
    MoreChunks -->|No| CheckFailures{Any Failures?}
    
    CheckFailures -->|Yes| LogFailures[Log: Failed items details]
    CheckFailures -->|No| LogSuccess[Log: Batch complete<br/>All successful]
    
    LogFailures --> ThrowError[Throw Error:<br/>Batch partially failed]
    ThrowError --> RetryBatch[Message Will Retry]
    
    LogSuccess --> AckBatch[ACK Message<br/>Batch Complete]
    
    RetryBatch --> End[End]
    AckBatch --> End
    
    style ParallelExec fill:#bbdefb
    style Compile1 fill:#e1f5ff
    style Compile2 fill:#e1f5ff
    style Compile3 fill:#e1f5ff
    style Success1 fill:#c8e6c9
    style Success2 fill:#c8e6c9
    style Success3 fill:#c8e6c9
    style Fail1 fill:#ffcdd2
    style Fail2 fill:#ffcdd2
    style Fail3 fill:#ffcdd2
    style ThrowError fill:#f44336
    style AckBatch fill:#4caf50

Cache Warming Flow

Process for pre-warming the cache with popular filter lists.

flowchart TD
    Start[processCacheWarmMessage] --> Extract[Extract configurations array]
    
    Extract --> LogStart[Log: Starting cache warming<br/>for N configurations]
    
    LogStart --> InitStats[Initialize:<br/>successful=0, failed=0, failures=[]]
    
    InitStats --> ChunkLoop[Process in Chunks of 3]
    
    ChunkLoop --> Chunk1{Chunk 1}
    Chunk1 --> Config1A[Configuration A]
    Chunk1 --> Config1B[Configuration B]
    Chunk1 --> Config1C[Configuration C]
    
    Config1A --> Compile1A[Create CompileQueueMessage<br/>Generate Request ID]
    Config1B --> Compile1B[Create CompileQueueMessage<br/>Generate Request ID]
    Config1C --> Compile1C[Create CompileQueueMessage<br/>Generate Request ID]
    
    Compile1A --> Process1A[processCompileMessage:<br/>Validate, Fetch, Compile]
    Compile1B --> Process1B[processCompileMessage:<br/>Validate, Fetch, Compile]
    Compile1C --> Process1C[processCompileMessage:<br/>Validate, Fetch, Compile]
    
    Process1A --> Cache1A[Cache Result in KV]
    Process1B --> Cache1B[Cache Result in KV]
    Process1C --> Cache1C[Cache Result in KV]
    
    Cache1A --> Result1A{Success?}
    Cache1B --> Result1B{Success?}
    Cache1C --> Result1C{Success?}
    
    Result1A -->|Yes| Inc1A[successful++]
    Result1A -->|No| Fail1A[failed++, Record Error]
    Result1B -->|Yes| Inc1B[successful++]
    Result1B -->|No| Fail1B[failed++, Record Error]
    Result1C -->|Yes| Inc1C[successful++]
    Result1C -->|No| Fail1C[failed++, Record Error]
    
    Inc1A --> ChunkDone
    Fail1A --> ChunkDone
    Inc1B --> ChunkDone
    Fail1B --> ChunkDone
    Inc1C --> ChunkDone
    Fail1C --> ChunkDone
    
    ChunkDone[Log: Chunk complete] --> MoreChunks{More Chunks?}
    
    MoreChunks -->|Yes| ChunkLoop
    MoreChunks -->|No| FinalCheck{Any Failures?}
    
    FinalCheck -->|Yes| LogErrors[Log: Failed configurations<br/>with details]
    FinalCheck -->|No| LogComplete[Log: Cache warming complete<br/>All successful]
    
    LogErrors --> ThrowError[Throw Error:<br/>Partially Failed]
    LogComplete --> Success[Cache Ready for<br/>Future Requests]
    
    ThrowError --> Retry[Message Retried]
    Success --> End[End]
    Retry --> End
    
    style Process1A fill:#e1f5ff
    style Process1B fill:#e1f5ff
    style Process1C fill:#e1f5ff
    style Cache1A fill:#fff9c4
    style Cache1B fill:#fff9c4
    style Cache1C fill:#fff9c4
    style Inc1A fill:#c8e6c9
    style Inc1B fill:#c8e6c9
    style Inc1C fill:#c8e6c9
    style Fail1A fill:#ffcdd2
    style Fail1B fill:#ffcdd2
    style Fail1C fill:#ffcdd2
    style Success fill:#4caf50

Compilation Workflows

Filter Compilation Process

Core compilation flow from configuration to final rules.

flowchart TD
    Start[FilterCompiler.compileWithMetrics] --> InitBenchmark{Benchmark Enabled?}
    
    InitBenchmark -->|Yes| CreateCollector[Create BenchmarkCollector]
    InitBenchmark -->|No| NoBenchmark[collector = null]
    
    CreateCollector --> StartTrace
    NoBenchmark --> StartTrace[Start Tracing: compileFilterList]
    
    StartTrace --> ValidateConfig[Validate Configuration]
    ValidateConfig --> ValidationCheck{Valid?}
    
    ValidationCheck -->|No| LogValidationError[Emit operationError<br/>Log Error]
    ValidationCheck -->|Yes| TraceValidation[Emit operationComplete<br/>valid: true]
    
    LogValidationError --> ThrowError[Throw ConfigurationError]
    
    TraceValidation --> LogConfig[Log Configuration JSON]
    LogConfig --> ExtractSources[Extract configuration.sources]
    
    ExtractSources --> StartSourceTrace[Start Tracing: compileSources]
    StartSourceTrace --> ParallelSources[Promise.all: Compile Sources in Parallel]
    
    ParallelSources --> Source1[SourceCompiler.compile<br/>Source 0 of N]
    ParallelSources --> Source2[SourceCompiler.compile<br/>Source 1 of N]
    ParallelSources --> Source3[SourceCompiler.compile<br/>Source N-1 of N]
    
    Source1 --> Rules1[rules: string[]]
    Source2 --> Rules2[rules: string[]]
    Source3 --> Rules3[rules: string[]]
    
    Rules1 --> CompleteTrace
    Rules2 --> CompleteTrace
    Rules3 --> CompleteTrace[Emit operationComplete<br/>totalRules count]
    
    CompleteTrace --> CombineResults[Combine Source Results<br/>Maintain Order]
    
    CombineResults --> AddHeaders[Add Source Headers]
    AddHeaders --> ApplyTransforms[Apply Transformations]
    
    ApplyTransforms --> Transform1[Transformation 1]
    Transform1 --> Transform2[Transformation 2]
    Transform2 --> TransformN[Transformation N]
    
    TransformN --> CompleteCompilation[Emit operationComplete:<br/>compileFilterList]
    
    CompleteCompilation --> GenerateHeader[Generate List Header]
    GenerateHeader --> AddChecksum[Add Checksum to Header]
    
    AddChecksum --> FinalRules[Combine: Header + Rules]
    FinalRules --> CollectMetrics{Benchmark?}
    
    CollectMetrics -->|Yes| StopCollector[collector.stop<br/>Gather Metrics]
    CollectMetrics -->|No| NoMetrics[metrics = undefined]
    
    StopCollector --> ReturnResult
    NoMetrics --> ReturnResult[Return: CompilationResult<br/>rules, metrics, diagnostics]
    
    ReturnResult --> End[End]
    ThrowError --> End
    
    style ParallelSources fill:#bbdefb
    style Source1 fill:#e1f5ff
    style Source2 fill:#e1f5ff
    style Source3 fill:#e1f5ff
    style ApplyTransforms fill:#fff9c4
    style ReturnResult fill:#c8e6c9
    style ThrowError fill:#ffcdd2

Source Compilation

Individual source processing within the compiler.

sequenceDiagram
    participant FC as FilterCompiler
    participant SC as SourceCompiler
    participant FD as FilterDownloader
    participant Pipeline as TransformationPipeline
    participant Trace as TracingContext
    participant Events as EventEmitter

    FC->>SC: compile(source, index, totalSources)
    SC->>Trace: operationStart('compileSource')
    SC->>Events: onProgress('Downloading...')
    
    SC->>FD: download(source.source)
    FD->>FD: Fetch URL / Use Pre-fetched
    
    alt Download Failed
        FD-->>SC: throw DownloadError
        SC->>Trace: operationError(error)
        SC->>Events: onSourceError(error)
        SC-->>FC: throw error
    else Download Success
        FD-->>SC: rules: string[]
        SC->>Trace: operationComplete(download)
        SC->>Events: onSourceComplete
        
        SC->>Events: onProgress('Applying transformations...')
        SC->>Pipeline: applyAll(rules, source.transformations)
        
        loop For Each Transformation
            Pipeline->>Pipeline: Apply Transformation
            Pipeline->>Events: onTransformationApplied
        end
        
        Pipeline-->>SC: transformed rules
        SC->>Trace: operationComplete('compileSource')
        SC-->>FC: rules: string[]
    end

Transformation Pipeline

The transformation pipeline applies a series of rule transformations in a fixed order.

flowchart TD
    subgraph "Input"
        INPUT[Raw Rules Array<br/>from Source Fetch]
    end

    subgraph "Pre-Processing"
        INPUT --> EXCLUSIONS{Has Exclusion<br/>Patterns?}
        EXCLUSIONS -->|Yes| APPLY_EXCL[Apply Exclusions<br/>Remove matching rules]
        EXCLUSIONS -->|No| INCLUSIONS
        APPLY_EXCL --> INCLUSIONS{Has Inclusion<br/>Patterns?}
        INCLUSIONS -->|Yes| APPLY_INCL[Apply Inclusions<br/>Keep only matching rules]
        INCLUSIONS -->|No| TRANSFORM_START
        APPLY_INCL --> TRANSFORM_START[Start Transformation Pipeline]
    end

    subgraph "Transformation Pipeline (Fixed Order)"
        TRANSFORM_START --> T1[1. ConvertToAscii<br/>Non-ASCII → Punycode]
        T1 --> T2[2. TrimLines<br/>Remove whitespace]
        T2 --> T3[3. RemoveComments<br/>Remove ! and # lines]
        T3 --> T4[4. Compress<br/>Hosts → Adblock syntax]
        T4 --> T5[5. RemoveModifiers<br/>Strip unsupported modifiers]
        T5 --> T6[6. InvertAllow<br/>@@ → blocking rules]
        T6 --> T7[7. Validate<br/>Remove dangerous rules]
        T7 --> T8[8. ValidateAllowIp<br/>Validate preserving IPs]
        T8 --> T9[9. Deduplicate<br/>Remove duplicate rules]
        T9 --> T10[10. RemoveEmptyLines<br/>Remove blank lines]
        T10 --> T11[11. InsertFinalNewLine<br/>Add trailing newline]
    end

    subgraph "Output"
        T11 --> OUTPUT[Transformed Rules Array]
    end

    style T1 fill:#e3f2fd
    style T2 fill:#e3f2fd
    style T3 fill:#e3f2fd
    style T4 fill:#fff8e1
    style T5 fill:#fff8e1
    style T6 fill:#fff8e1
    style T7 fill:#fce4ec
    style T8 fill:#fce4ec
    style T9 fill:#e8f5e9
    style T10 fill:#e8f5e9
    style T11 fill:#e8f5e9

Transformation Details:

flowchart LR
    subgraph "Text Processing"
        T1[ConvertToAscii]
        T2[TrimLines]
        T3[RemoveComments]
    end

    subgraph "Format Conversion"
        T4[Compress]
        T5[RemoveModifiers]
        T6[InvertAllow]
    end

    subgraph "Validation"
        T7[Validate]
        T8[ValidateAllowIp]
    end

    subgraph "Cleanup"
        T9[Deduplicate]
        T10[RemoveEmptyLines]
        T11[InsertFinalNewLine]
    end

    T1 --> T2 --> T3 --> T4 --> T5 --> T6 --> T7 --> T8 --> T9 --> T10 --> T11
TransformationPurposeExample
ConvertToAsciiPunycode encodingädblock.comxn--dblock-bua.com
TrimLinesClean whitespace rule rule
RemoveCommentsStrip comments! Comment → (removed)
CompressHosts to adblock0.0.0.0 ads.com → `
RemoveModifiersStrip modifiers`
InvertAllowConvert exceptions`@@
ValidateRemove dangerous`
ValidateAllowIpValidate + IPsKeep 127.0.0.1 rules
DeduplicateRemove duplicates`
RemoveEmptyLinesClean blanks(blank lines removed)
InsertFinalNewLineAdd newlineEnsure file ends with \n

Pattern Matching Optimization:

flowchart TD
    subgraph "Pattern Classification"
        PATTERN[Exclusion/Inclusion Pattern] --> CHECK{Contains Wildcard?}
        CHECK -->|No| PLAIN[Plain String Pattern]
        CHECK -->|Yes| REGEX[Wildcard Pattern]
    end

    subgraph "Plain String Matching"
        PLAIN --> INCLUDES[String.includes]
        INCLUDES --> FAST[O(n) per rule<br/>Very Fast]
    end

    subgraph "Wildcard Pattern Matching"
        REGEX --> COMPILE[Compile to Regex]
        COMPILE --> WILDCARDS[* → .*<br/>? → .]
        WILDCARDS --> MATCH[RegExp.test]
        MATCH --> SLOWER[O(n) with regex overhead]
    end

    subgraph "Optimization"
        FAST --> SET[Use Set for O(1) lookups<br/>when checking requested transformations]
        SLOWER --> SET
    end

    style PLAIN fill:#c8e6c9
    style REGEX fill:#fff9c4
    style SET fill:#e1f5ff

Request Deduplication

In-flight request deduplication using cache keys.

flowchart TD
    Start[Incoming Request] --> ExtractConfig[Extract Configuration]
    
    ExtractConfig --> HasPreFetch{Has Pre-fetched<br/>Content?}
    
    HasPreFetch -->|Yes| BypassDedup[Skip Deduplication<br/>No Cache Key]
    HasPreFetch -->|No| GenerateKey[Generate Cache Key<br/>getCacheKey]
    
    GenerateKey --> NormalizeConfig[Normalize Config:<br/>Sort Keys, JSON.stringify]
    NormalizeConfig --> HashConfig[Hash String<br/>hashString]
    HashConfig --> CreateKey[cache:HASH]
    
    CreateKey --> CheckPending{Pending Request<br/>Exists?}
    
    CheckPending -->|Yes| WaitPending[Wait for Existing<br/>Promise to Resolve]
    CheckPending -->|No| CheckCache{Check KV Cache}
    
    WaitPending --> GetResult[Get Shared Result]
    GetResult --> ReturnCached[Return Cached Result]
    
    CheckCache -->|Hit| DecompressCache[Decompress gzip]
    CheckCache -->|Miss| AddPending[Add to pendingCompilations Map]
    
    DecompressCache --> ReturnCached
    
    AddPending --> StartCompile[Start New Compilation]
    StartCompile --> DoCompile[Execute Compilation]
    DoCompile --> Compress[Compress Result - gzip]
    Compress --> StoreCache[Store in KV Cache<br/>TTL: CACHE_TTL]
    StoreCache --> RemovePending[Remove from pendingCompilations]
    RemovePending --> ReturnResult[Return Fresh Result]
    
    BypassDedup --> DoCompile
    ReturnResult --> End[End]
    ReturnCached --> End
    
    style CheckPending fill:#fff9c4
    style WaitPending fill:#ffe0b2
    style AddPending fill:#e1f5ff
    style ReturnCached fill:#c8e6c9
    style ReturnResult fill:#c8e6c9

Supporting Processes

Rate Limiting

Rate limiting check for incoming requests.

flowchart TD
    Start[checkRateLimit] --> ExtractIP[Extract Client IP]
    
    ExtractIP --> CreateKey[Create Key:<br/>ratelimit:IP]
    CreateKey --> GetCurrent[Get Current Count from KV]
    
    GetCurrent --> CheckData{Data Exists?}
    
    CheckData -->|No| FirstRequest[First Request or Expired]
    CheckData -->|Yes| CheckExpired{now > resetAt?}
    
    CheckExpired -->|Yes| WindowExpired[Window Expired]
    CheckExpired -->|No| CheckLimit{count >= MAX_REQUESTS?}
    
    FirstRequest --> StartWindow[Create New Window:<br/>count=1, resetAt=now+WINDOW]
    WindowExpired --> StartWindow
    
    StartWindow --> StoreNew[Store in KV<br/>TTL: WINDOW + 10s]
    StoreNew --> AllowRequest[Return: true - Allow]
    
    CheckLimit -->|Yes| DenyRequest[Return: false - Deny]
    CheckLimit -->|No| IncrementCount[Increment count++]
    
    IncrementCount --> UpdateKV[Update KV:<br/>Same resetAt, New count]
    UpdateKV --> AllowRequest
    
    AllowRequest --> End[End]
    DenyRequest --> End
    
    style AllowRequest fill:#c8e6c9
    style DenyRequest fill:#ffcdd2
    style StartWindow fill:#e1f5ff

Caching Strategy

Comprehensive caching flow with compression.

flowchart LR
    subgraph "Write Path"
        CompileComplete[Compilation Complete] --> CreateResult[Create CompilationResult:<br/>success, rules, ruleCount, metrics, compiledAt]
        CreateResult --> MeasureSize[Measure Uncompressed Size]
        MeasureSize --> Compress[Compress with gzip]
        Compress --> MeasureCompressed[Measure Compressed Size]
        MeasureCompressed --> CalcRatio[Calculate Compression Ratio:<br/>70-80% typical]
        CalcRatio --> StoreKV[Store in KV:<br/>Key: cache:HASH<br/>TTL: 3600s]
        StoreKV --> LogCache[Log: Cache stored<br/>Size & Compression]
    end
    
    subgraph "Read Path"
        Request[Incoming Request] --> GenerateKey[Generate Cache Key]
        GenerateKey --> LookupKV[Lookup in KV]
        LookupKV --> Found{Found?}
        Found -->|No| CacheMiss[Cache Miss]
        Found -->|Yes| ReadCompressed[Read Compressed Data]
        ReadCompressed --> Decompress[Decompress gzip]
        Decompress --> ParseJSON[Parse JSON]
        ParseJSON --> ReturnCached[Return Result<br/>cached: true]
        CacheMiss --> CompileNew[Start New Compilation]
    end
    
    LogCache -.->|Later Request| Request
    
    style Compress fill:#fff9c4
    style StoreKV fill:#e1f5ff
    style ReturnCached fill:#c8e6c9
    style CacheMiss fill:#ffcdd2

Error Handling & Retry

Queue message retry strategy with exponential backoff.

stateDiagram-v2
    [*] --> Enqueued: Message Sent to Queue
    
    Enqueued --> Batched: Queue Batching
    Batched --> Processing: Consumer Receives
    
    Processing --> Validating: Extract & Validate
    
    Validating --> Compiling: Valid Message
    Validating --> UnknownType: Unknown Type
    
    UnknownType --> Acknowledged: ACK (Prevent Loop)
    Acknowledged --> [*]
    
    Compiling --> CachingResult: Compilation Success
    Compiling --> Error: Compilation Failed
    
    CachingResult --> Acknowledged: ACK Success
    
    Error --> Retry1: 1st Retry (Backoff: 2s)
    Retry1 --> Compiling
    
    Retry1 --> Retry2: Still Failed
    Retry2 --> Compiling: 2nd Retry (Backoff: 4s)
    
    Retry2 --> Retry3: Still Failed
    Retry3 --> Compiling: 3rd Retry (Backoff: 8s)
    
    Retry3 --> RetryN: Still Failed
    RetryN --> Compiling: Nth Retry (Backoff: 2^n s)
    
    RetryN --> DeadLetterQueue: Max Retries Exceeded
    DeadLetterQueue --> [*]: Manual Investigation
    
    note right of Error
        Retries triggered by:
        - Network failures
        - Source download errors
        - Compilation errors
        - KV storage errors
    end note
    
    note right of Acknowledged
        Success metrics tracked:
        - Request ID
        - Config name
        - Rule count
        - Duration
        - Cache key
    end note

Queue Statistics & Monitoring

Queue statistics tracking for observability.

flowchart TD
    subgraph "Statistics Tracked"
        Enqueued[Enqueued Count]
        Completed[Completed Count]
        Failed[Failed Count]
        Processing[Processing Count]
    end
    
    subgraph "Per Job Metadata"
        RequestID[Request ID]
        ConfigName[Config Name]
        RuleCount[Rule Count]
        Duration[Duration ms]
        CacheKey[Cache Key]
        Error[Error Message]
    end
    
    subgraph "Storage"
        MetricsKV[(Metrics KV Store)]
        Logs[Console Logs]
        TailWorker[Tail Worker Events]
    end
    
    Enqueued --> MetricsKV
    Completed --> MetricsKV
    Failed --> MetricsKV
    Processing --> MetricsKV
    
    RequestID --> Logs
    ConfigName --> Logs
    RuleCount --> Logs
    Duration --> Logs
    CacheKey --> Logs
    Error --> Logs
    
    Logs --> TailWorker
    MetricsKV --> Dashboard[Cloudflare Dashboard]
    TailWorker --> ExternalMonitoring[External Monitoring<br/>Datadog, Splunk, etc.]
    
    style MetricsKV fill:#e1f5ff
    style Logs fill:#fff9c4
    style TailWorker fill:#ffe0b2

Message Type Reference

Quick reference for the three queue message types:

Message TypePurposeProcessingChunking
compileSingle compilation jobDirect compilation → cacheN/A
batch-compileMultiple compilationsParallel chunks of 3Yes (3 items)
cache-warmPre-compile popular listsParallel chunks of 3Yes (3 items)

Priority Level Comparison

PriorityQueuemax_batch_sizemax_batch_timeoutUse Case
standardadblock-compiler-worker-queue105sBatch operations, scheduled jobs
highadblock-compiler-worker-queue-high-priority52sPremium users, urgent requests

Notes

  • All queue processing is asynchronous and non-blocking
  • Parallel processing is limited to chunks of 3 to prevent resource exhaustion
  • Cache TTL is 1 hour (3600s) by default
  • Compression typically achieves 70-80% size reduction
  • Rate limiting window is 60 seconds with max 10 requests per IP
  • All operations include comprehensive logging with structured prefixes
  • Diagnostic events are emitted to tail worker for centralized monitoring
  • Error recovery uses exponential backoff with automatic retry
  • Unknown message types are acknowledged to prevent infinite retry loops

Workflow Improvements Summary

This document provides a quick overview of the improvements made to GitHub Actions workflows.

Executive Summary

The workflows have been rewritten to:

  • Run 40-50% faster through parallelization
  • Fail faster with early validation
  • Use resources more efficiently with better caching
  • Be more maintainable with clearer structure
  • Follow best practices with proper gating and permissions

CI Workflow Improvements (Round 2)

Eight additional enhancements landed in PR #788:

Before → After Comparison

AspectBeforeAfterImprovement
deno install12-line retry block duplicated in 5 jobsComposite action .github/actions/deno-installNo duplication
Worker build on PRsNot verified until deploy to mainverify-deploy dry-run on every PRCatch failures before merge
Frontend jobsTwo separate jobs (frontend + frontend-build)Single frontend-build jobOne pnpm install per run
pnpm lockfile--no-frozen-lockfile (silent drift)--frozen-lockfile (fails on drift)Enforced consistency
Coverage uploadMain push onlyPRs and main pushCoverage visible on PRs
Action versionsFloating tags (@v4)Full commit SHAs + commentsSupply-chain hardened
Migration errors|| echo "already applied or failed" silenced real errorsrun_migration() function parses outputReal errors fail the step
Dead codedetect-changes job (always returned true)RemovedCleaner pipeline

New Job: verify-deploy

Runs a Cloudflare Worker build dry-run on every pull request:

# Runs on PRs only — uses the frontend artifact from frontend-build
verify-deploy:
    needs: [frontend-build]
    if: github.event_name == 'pull_request'
    steps:
        - uses: ./.github/actions/deno-install
        - run: deno task wrangler:verify

The ci-gate job includes verify-deploy in its needs list, so a failing Worker build blocks merge.

Composite Action: deno-install

Extracted the 3-attempt deno install retry loop into a reusable composite action:

# .github/actions/deno-install/action.yml
steps:
    - name: Install dependencies
      env:
          DENO_TLS_CA_STORE: system
      run: |
          for i in 1 2 3; do
              deno install && break
              if [ "$i" -lt 3 ]; then
                  echo "Attempt $i failed, retrying in 10s..."
                  sleep 10
              else
                  echo "All 3 attempts failed."
                  exit 1
              fi
          done

CI Workflow Improvements (Round 1)

Before → After Comparison

AspectBeforeAfterImprovement
Structure1 monolithic job + separate jobs5 parallel jobs + gated sequential jobsBetter parallelization
Runtime~5-7 minutes~2-3 minutes40-50% faster
Type Checking2 files onlyAll entry pointsMore comprehensive
CachingBasic (deno.json only)Advanced (deno.json + deno.lock)More precise
Deployment2 separate jobs1 combined jobSimpler
GatingSecurity runs independentlyAll checks gate publish/deployMore reliable

Key Changes

# BEFORE: Sequential execution in single job
jobs:
  ci:
    steps:
      - Lint
      - Format
      - Type Check
      - Test
  security: # Runs independently
  publish: # Only depends on ci
  deploy-worker: # Depends on ci + security
  deploy-pages: # Depends on ci + security

# AFTER: Parallel execution with proper gating
jobs:
  lint:        # \
  format:      #  |-- Run in parallel
  typecheck:   #  |
  test:        #  |
  security:    # /
  publish:     # Depends on ALL above
  deploy:      # Depends on ALL above (combined worker + pages)

Release Workflow Improvements

Before → After Comparison

AspectBeforeAfterImprovement
ValidationNoneFull CI before buildsFail fast
Binary CachingNo per-target cachePer-target + OS cacheFaster builds
Asset PrepComplex loopSimple find commandCleaner code
CommentsVerbose warningsConcise, essential onlyMore readable

Key Changes

# BEFORE: Build immediately, might fail late
jobs:
  build-binaries:
    # Starts building right away
  build-docker:
    # Builds without validation

# AFTER: Validate first, then build
jobs:
  validate:
    # Run lint, format, typecheck, test
  build-binaries:
    needs: validate  # Only run after validation
  build-docker:
    needs: validate  # Only run after validation

Version Bump Workflow Improvements

Before → After Comparison

AspectBeforeAfterImprovement
TriggerAuto on PR + ManualManual onlyLess disruptive
Files Updated9 files (including examples)4 core files onlyFocused
Error Handlingif/elif chaincase statementMore robust
ValidationNoneVerification stepMore reliable
Git OperationsAdd all filesSelective addSafer

Key Changes

# BEFORE: Automatic trigger
on:
  pull_request:
    types: [opened]  # Auto-runs on every PR!
  workflow_dispatch:

# AFTER: Manual only
on:
  workflow_dispatch:  # Only runs when explicitly triggered

Performance Impact

CI Workflow

Before (~8-10 minutes total):

flowchart LR
    subgraph SEQ["CI Job (sequential) — 5-7 min"]
        L[Lint<br/>1 min] --> F[Format<br/>1 min] --> TC[Type Check<br/>1 min] --> T[Test<br/>2-4 min]
    end
    SEC[Security<br/>2 min]
    T --> PUB[Publish<br/>1 min]
    SEC --> PUB
    PUB --> DW[Deploy Worker<br/>1 min]
    DW --> DP[Deploy Pages<br/>1 min]

After (~4-6 minutes total, 40-50% improvement):

flowchart LR
    subgraph PAR["Parallel Phase — 2-4 min"]
        L[Lint<br/>1 min]
        F[Format<br/>1 min]
        TC[Type Check<br/>1 min]
        T[Test<br/>2-4 min]
        SEC[Security<br/>2 min]
    end
    L --> PUB[Publish<br/>1 min]
    F --> PUB
    TC --> PUB
    T --> PUB
    SEC --> PUB
    PUB --> DEP[Deploy<br/>1 min]

Release Workflow

Before (on failure, ~15 minutes wasted):

flowchart LR
    BB[Build Binaries<br/>10 min] --> BD[Build Docker<br/>5 min] --> CR[Create Release<br/>❌ fails here]

After (on failure, ~3 minutes wasted — 80% improvement):

flowchart LR
    V[Validate<br/>❌ fails here<br/>3 min]

Caching Strategy

Before

key: deno-${{ runner.os }}-${{ hashFiles('deno.json') }}
restore-keys: deno-${{ runner.os }}-

After

key: deno-${{ runner.os }}-${{ hashFiles('deno.json', 'deno.lock') }}
restore-keys: |
    deno-${{ runner.os }}-

Benefits:

  • More precise cache invalidation (includes lock file)
  • Better restore key strategy
  • Per-target caching for binaries

Best Practices Implemented

Principle of Least Privilege: Minimal permissions per job ✅ Fail Fast: Validate before expensive operations ✅ Parallelization: Independent tasks run concurrently ✅ Proper Gating: Critical jobs depend on quality checks ✅ Concurrency Control: Cancel outdated runs automatically ✅ Idempotency: Workflows can be safely re-run ✅ Clear Naming: Job names clearly indicate purpose ✅ Efficient Caching: Smart cache keys and restore strategies ✅ Supply-Chain Hardening: Third-party actions pinned to full commit SHAs ✅ DRY Composite Actions: Shared retry logic extracted to .github/actions/PR Build Verification: Worker dry-run validates deployability on every PR

Breaking Changes

⚠️ Version Bump Workflow

  • No longer triggers automatically on PR open
  • Must be run manually via workflow_dispatch
  • No longer updates example files

Migration Guide

For Contributors

Before: Version was auto-bumped on PR creation After: Manually run "Version Bump" workflow when needed

For Maintainers

Before:

  1. Merge PR → Auto publish → Manual tag → Release

After:

  1. Merge PR → Auto publish
  2. Run "Version Bump" workflow
  3. Tag created → Release triggered

OR

  1. Merge PR → Auto publish
  2. Run "Version Bump" with "Create release" checked
  3. Done!

Monitoring

Success Metrics

Track these to measure improvement:

  • ✅ Average CI runtime (target: <5 min)
  • ✅ Success rate on first run (target: >90%)
  • ✅ Time to failure (target: <3 min)
  • ✅ Cache hit rate (target: >80%)

What to Watch

  • Long test runs: If tests exceed 5 minutes, consider parallelization
  • Cache misses: If cache hit rate drops, check lock file stability
  • Build failures: ARM64 builds might need cross-compilation setup

Future Optimizations

Potential improvements for consideration:

  1. Test Parallelization: Split tests by module
  2. Selective Testing: Only test changed modules on PRs
  3. Artifact Caching: Cache build artifacts between jobs
  4. Matrix Testing: Test on multiple Deno versions
  5. Scheduled Scans: Weekly security scans instead of every commit

Conclusion

These workflow improvements provide:

  • Faster feedback for developers
  • More reliable deployments
  • Better resource utilization
  • Clearer structure for maintenance

The changes maintain backward compatibility while significantly improving performance and reliability.

Workflow Cleanup Summary

Overview

This document summarizes the workflow cleanup performed to simplify the CI/CD pipeline and reduce complexity.

Changes Made

Workflows Removed (8 files)

AI Agent Workflows (6 files)

These workflows relied on the external Warp Oz Agent service and added significant complexity:

  1. auto-fix-issue.yml - AI agent for automatically fixing issues labeled with oz-agent
  2. daily-issue-summary.yml - AI-generated daily issue summaries posted to Slack
  3. fix-failing-checks.yml - AI agent for automatically fixing failing CI checks
  4. respond-to-comment.yml - AI assistant responding to @oz-agent mentions in PR comments
  5. review-pr.yml - AI-powered automated code review for PRs
  6. suggest-review-fixes.yml - AI-powered suggestions for review comment fixes

Rationale for removal:

  • External dependency on Warp Oz Agent service
  • Added complexity to the workflow structure
  • Not essential for core project functionality
  • Can be re-added in the future if needed

Version Bump Workflows (2 files consolidated)

These workflows had overlapping functionality:

  1. auto-version-bump.yml - Automatic version bumping based on conventional commits
  2. version-bump.yml (old) - Manual version bumping

Consolidation:

  • Merged both workflows into a single version-bump.yml that supports:
    • Automatic version detection from conventional commits
    • Manual version bump specification
    • Changelog generation
    • PR-based workflow

Workflows Kept (4 files)

  1. ci.yml - Main CI/CD pipeline

    • Linting, formatting, type checking
    • Testing with coverage
    • Security scanning
    • Publishing to JSR
    • Cloudflare deployment (optional)
  2. version-bump.yml (new) - Consolidated version management

    • Auto-detects version bumps from conventional commits
    • Supports manual version specification
    • Generates changelog entries
    • Creates version bump PRs
  3. create-version-tag.yml - Automatic tag creation

    • Creates release tags when version bump PRs are merged
    • Triggers release workflow
  4. release.yml - Release builds and publishing

    • Multi-platform binary builds
    • Docker image builds
    • GitHub release creation

Impact

Quantitative Changes

  • Before: 12 workflows
  • After: 4 workflows
  • Reduction: 67% (8 files removed)

Qualitative Improvements

Simplified CI/CD Pipeline

  • Fewer workflows to understand and maintain
  • Clearer workflow dependencies
  • Easier onboarding for new contributors

Reduced External Dependencies

  • No longer requires Warp Oz Agent API key
  • No longer requires Slack webhook for issue summaries
  • Self-contained CI/CD pipeline

Better Maintainability

  • Single workflow for version management (instead of two)
  • Consolidated logic reduces duplication
  • Easier to debug and troubleshoot

Preserved Functionality

  • All essential CI/CD features retained
  • Version bumping still supports conventional commits
  • Release process unchanged

Migration Guide

For Contributors

Version Bumping:

  • No action required - automatic version bumping still works via conventional commits
  • Use proper commit message format: feat:, fix:, perf:, etc.
  • For manual bumps: Go to Actions → Version Bump → Run workflow

No More AI Agent Features:

  • Can no longer use @oz-agent in PR comments
  • Can no longer label issues with oz-agent for auto-fixing
  • No more automated PR reviews from AI agent

For Maintainers

Secrets No Longer Required:

  • WARP_API_KEY - Can be removed
  • SLACK_WEBHOOK_URL - Can be removed (if not used elsewhere)
  • WARP_AGENT_PROFILE - Repository variable can be removed

Secrets Still Required:

  • CODECOV_TOKEN - Optional for code coverage reports
  • CLOUDFLARE_API_TOKEN - Required for Cloudflare deployments
  • CLOUDFLARE_ACCOUNT_ID - Required for Cloudflare deployments

Repository Variables Still Required:

  • ENABLE_CLOUDFLARE_DEPLOY - Set to 'true' to enable deployments

Documentation Updates

The following documentation files were updated during the workflow cleanup:

  1. .github/workflows/README.md - Complete rewrite to reflect new workflow structure
  2. .github/WORKFLOWS.md (now at docs/WORKFLOWS.md) - Updated to remove AI agent references and consolidate version bump info
  3. docs/AUTO_VERSION_BUMP.md - Updated to reference consolidated version-bump.yml workflow

Testing Recommendations

Before merging these changes, test the following:

  1. YAML Syntax: All workflow files have valid YAML syntax
  2. CI Workflow: Test that CI runs properly on PRs
  3. Version Bump: Test automatic version bump on push to main
  4. Manual Version Bump: Test manual version bump via workflow dispatch
  5. Tag Creation: Test that tags are created after version bump PR merge
  6. Release: Test that releases are triggered by tags

Rollback Plan

If issues arise, the old workflows can be restored from git history:

# Get commit hash before cleanup
git log --oneline --all | grep "before cleanup"

# Restore old workflows
git checkout <commit-hash> -- .github/workflows/

Future Considerations

Potential Additions

  • Scheduled security scans (weekly)
  • Dependency update automation (Dependabot or similar)
  • Performance regression testing
  • Automated changelog generation improvements
  • Re-adding AI agent workflows without careful consideration
  • Adding more external service dependencies
  • Creating overlapping workflows with similar functionality

Conclusion

This cleanup significantly simplifies the CI/CD pipeline while maintaining all essential functionality. The reduction from 12 to 4 workflows makes the project more maintainable and easier to understand for contributors.

The consolidated version bump workflow combines the best features of both automatic and manual approaches, providing flexibility while reducing duplication.


Date: 2026-02-20 Author: GitHub Copilot Related PR: Clean up all workflow and CI actions