v0.5.2 release - Contributors, Sponsors and Enquiries are most welcome 😌

Guardrails

A comprehensive safety and validation engine for AI applications: content safety, prompt injection detection, output validation, and intelligent rate limiting.

Guardrails provide input/output validation, content filtering, and safety checks to ensure responsible AI behavior in production.

Installation

bash
pnpm add @lov3kaizen/agentsea-guardrails

Guard Categories

🛡️ Content Safety: protect against harmful content (Toxicity, PII, Topic, Bias)
🔒 Security: prevent attacks and data leakage (Prompt Injection, Jailbreak, Data Leakage)
✅ Validation: ensure output quality and format (Schema, Format, Factuality)
⚙️ Operational: manage resources and costs (Token Budget, Rate Limit, Cost)

Quick Start

typescript
import {
  createGuardrailsEngine,
  ToxicityGuard,
  PIIGuard,
  PromptInjectionGuard,
} from '@lov3kaizen/agentsea-guardrails';

// Create the engine
const engine = createGuardrailsEngine({
  guards: [
    { name: 'toxicity', enabled: true, type: 'input', action: 'block' },
    { name: 'pii', enabled: true, type: 'both', action: 'transform' },
    { name: 'prompt-injection', enabled: true, type: 'input', action: 'block' },
  ],
  failureMode: 'fail-fast',
  defaultAction: 'allow',
});

// Register guards
engine.registerGuard(new ToxicityGuard({ sensitivity: 'medium' }));
engine.registerGuard(new PIIGuard({ types: ['email', 'phone'], maskingStrategy: 'redact' }));
engine.registerGuard(new PromptInjectionGuard({ sensitivity: 'high' }));

// Check input
const result = await engine.checkInput('What is the weather today?', {
  sessionId: 'session-1',
  userId: 'user-1',
});

if (result.passed) {
  console.log('Input is safe');
} else {
  console.log(`Blocked: ${result.message}`);
}
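
Guards registered with type 'output' or 'both' are applied to model responses as well as user input. Below is a minimal sketch of output checking, assuming the engine exposes a checkOutput method that mirrors checkInput (the method name and result shape are assumptions, not confirmed API):

typescript
// Assumption: engine.checkOutput mirrors engine.checkInput.
const outputResult = await engine.checkOutput('Sure, my email is user@example.com.', {
  sessionId: 'session-1',
  userId: 'user-1',
});

if (outputResult.passed) {
  // With the PII guard's action set to 'transform', the response may come back
  // with emails redacted instead of being blocked outright.
  console.log('Output is safe to return');
} else {
  console.log(`Output blocked: ${outputResult.message}`);
}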

Content Guards

ToxicityGuard

Detects toxic, harmful, or inappropriate content:

typescript
import { ToxicityGuard } from '@lov3kaizen/agentsea-guardrails';

const guard = new ToxicityGuard({
  sensitivity: 'medium', // 'low' | 'medium' | 'high'
  categories: ['hate', 'violence', 'harassment', 'sexual'],
});

PIIGuard

Detects and optionally masks personally identifiable information:

typescript
import { PIIGuard } from '@lov3kaizen/agentsea-guardrails';

const guard = new PIIGuard({
  types: ['email', 'phone', 'ssn', 'creditCard', 'address', 'name'],
  maskingStrategy: 'redact', // 'redact' | 'mask' | 'hash'
  customPatterns: [
    { name: 'employeeId', pattern: /EMP-\d{6}/ },
  ],
});

TopicGuard

Filters content based on allowed/blocked topics:

typescript
import { TopicGuard } from '@lov3kaizen/agentsea-guardrails';

const guard = new TopicGuard({
  allowedTopics: ['technology', 'science', 'general'],
  blockedTopics: ['politics', 'religion'],
  confidenceThreshold: 0.7,
});

BiasGuard

Detects biased language:

typescript
import { BiasGuard } from '@lov3kaizen/agentsea-guardrails';

const guard = new BiasGuard({
  categories: ['gender', 'race', 'religion', 'political'],
  sensitivity: 'medium',
});

Security Guards

PromptInjectionGuard

Detects prompt injection attempts:

typescript
import { PromptInjectionGuard } from '@lov3kaizen/agentsea-guardrails';

const guard = new PromptInjectionGuard({
  sensitivity: 'high',
  customPatterns: [
    /reveal.*system.*prompt/i,
    /ignore.*previous.*instructions/i,
  ],
});

JailbreakGuard

Detects jailbreak attempts such as DAN-style prompts and roleplay attacks:

typescript
import { JailbreakGuard } from '@lov3kaizen/agentsea-guardrails';

const guard = new JailbreakGuard({
  sensitivity: 'high',
});

DataLeakageGuard

Prevents sensitive data from being exposed in outputs:

typescript
import { DataLeakageGuard } from '@lov3kaizen/agentsea-guardrails';

const guard = new DataLeakageGuard({
  patterns: ['apiKey', 'password', 'secret', 'token'],
  customPatterns: [
    { name: 'internalUrl', pattern: /internal\.company\.com/ },
  ],
});

Validation Guards

SchemaGuard

Validates output against a Zod schema:

typescript
import { SchemaGuard } from '@lov3kaizen/agentsea-guardrails';
import { z } from 'zod';

const ResponseSchema = z.object({
  answer: z.string(),
  confidence: z.number().min(0).max(1),
});

const guard = new SchemaGuard({
  schema: ResponseSchema,
});

FormatGuard

Ensures output matches expected format:

typescript
import { FormatGuard } from '@lov3kaizen/agentsea-guardrails';

const guard = new FormatGuard({
  format: 'json', // 'json' | 'xml' | 'markdown' | 'custom'
  customValidator: (content) => content.startsWith('{'),
});

Operational Guards

TokenBudgetGuard

Enforces per-request, per-session, and daily token budgets:

typescript
import { TokenBudgetGuard } from '@lov3kaizen/agentsea-guardrails';

const guard = new TokenBudgetGuard({
  maxTokensPerRequest: 4096,
  maxTokensPerSession: 50000,
  maxTokensPerDay: 1000000,
  warningThreshold: 0.8,
});

RateLimitGuard

Limits request rates per minute, hour, and day:

typescript
import { RateLimitGuard } from '@lov3kaizen/agentsea-guardrails';

const guard = new RateLimitGuard({
  requestsPerMinute: 60,
  requestsPerHour: 1000,
  requestsPerDay: 10000,
});

CostGuard

Tracks spend and enforces per-request, per-session, and daily cost limits:

typescript
import { CostGuard } from '@lov3kaizen/agentsea-guardrails';

const guard = new CostGuard({
  maxCostPerRequest: 0.10,
  maxCostPerSession: 5.00,
  maxCostPerDay: 100.00,
  currency: 'USD',
});
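
Operational guards plug into the same engine as the content and security guards. A brief sketch reusing createGuardrailsEngine, registerGuard, and checkInput from the Quick Start; the guard names in the config array ('token-budget', 'rate-limit', 'cost') and the partial guard options are assumptions:

typescript
import {
  createGuardrailsEngine,
  TokenBudgetGuard,
  RateLimitGuard,
  CostGuard,
} from '@lov3kaizen/agentsea-guardrails';

const opsEngine = createGuardrailsEngine({
  guards: [
    // Guard names are assumed; align them with the names your guards register under.
    { name: 'token-budget', enabled: true, type: 'input', action: 'block' },
    { name: 'rate-limit', enabled: true, type: 'input', action: 'block' },
    { name: 'cost', enabled: true, type: 'input', action: 'block' },
  ],
  failureMode: 'collect-all',
  defaultAction: 'allow',
});

// Assumes unspecified limits are optional and default sensibly.
opsEngine.registerGuard(new TokenBudgetGuard({ maxTokensPerRequest: 4096, warningThreshold: 0.8 }));
opsEngine.registerGuard(new RateLimitGuard({ requestsPerMinute: 60 }));
opsEngine.registerGuard(new CostGuard({ maxCostPerRequest: 0.10, currency: 'USD' }));

const usage = await opsEngine.checkInput('Summarize this report', {
  sessionId: 'session-1',
  userId: 'user-1',
});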

Rules Engine

Define policies with declarative, JSON-serializable rules:

typescript
import { createRulesEngine, type RuleSet } from '@lov3kaizen/agentsea-guardrails';

const rules: RuleSet = {
  id: 'content-policy',
  name: 'Content Policy',
  version: '1.0.0',
  rules: [
    {
      id: 'block-profanity',
      name: 'Block Profanity',
      conditions: [
        { field: 'input', operator: 'matches', value: '\\b(bad|word)\\b' },
      ],
      actions: [
        { type: 'block', params: { reason: 'Profanity detected' } },
      ],
      priority: 100,
      enabled: true,
    },
    {
      id: 'redact-emails',
      name: 'Redact Emails',
      conditions: [
        { field: 'input', operator: 'matches', value: '[a-z]+@[a-z]+\\.[a-z]+' },
      ],
      actions: [
        { type: 'transform', params: { pattern: '...', replacement: '[EMAIL]' } },
      ],
      priority: 80,
      enabled: true,
    },
  ],
};

const engine = createRulesEngine({ defaultAction: 'allow' });
engine.loadRuleSet(rules);

const result = await engine.evaluate({
  input: 'Contact me at user@example.com',
  type: 'input',
  metadata: {},
});

NestJS Integration

typescript
import { Module, Controller, Post, Body } from '@nestjs/common';
import { GuardrailsModule, Guardrailed, BypassGuards } from '@lov3kaizen/agentsea-guardrails/nestjs';
import { z } from 'zod';

@Module({
  imports: [
    GuardrailsModule.forRoot({
      guards: [
        { name: 'toxicity', enabled: true, type: 'input', action: 'block' },
        { name: 'pii', enabled: true, type: 'both', action: 'transform' },
        { name: 'prompt-injection', enabled: true, type: 'input', action: 'block' },
      ],
      failureMode: 'fail-fast',
      defaultAction: 'allow',
    }),
  ],
})
export class AppModule {}

const ResponseSchema = z.object({
  answer: z.string(),
  confidence: z.number(),
});

@Controller('chat')
export class ChatController {
  @Post()
  @Guardrailed({
    input: ['toxicity', 'prompt-injection', 'pii'],
    output: ['pii', 'schema'],
    schema: ResponseSchema,
  })
  async chat(@Body() body: { message: string }) {
    return { answer: '...', confidence: 0.95 };
  }

  @Post('admin')
  @BypassGuards()
  async adminChat(@Body() body: { message: string }) {
    // Bypasses all guardrails
    return { answer: '...' };
  }
}

AgentSea Integration

typescript
import { Agent } from '@lov3kaizen/agentsea-core';
import { GuardrailsMiddleware, GuardedAgent } from '@lov3kaizen/agentsea-guardrails/agentsea';

// Middleware approach
const agent = new Agent({ /* config */ });
agent.use(new GuardrailsMiddleware(guardrailsConfig));

// Wrapper approach
const guardedAgent = new GuardedAgent(agent, guardrailsEngine);
const response = await guardedAgent.run('User message');

Configuration

typescript
interface GuardrailsConfig {
  // Array of guard configurations
  guards: GuardConfig[];

  // How to handle failures
  // - 'fail-fast': Stop on first failure
  // - 'fail-safe': Continue with warnings
  // - 'collect-all': Run all guards, collect results
  failureMode: 'fail-fast' | 'fail-safe' | 'collect-all';

  // Default action when no guard blocks
  defaultAction: 'allow' | 'block' | 'warn';

  // Telemetry settings
  telemetry?: {
    logging?: { enabled: boolean; level: string };
    metrics?: { enabled: boolean; prefix: string };
    tracing?: { enabled: boolean; serviceName: string };
  };
}
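
Putting it together, a full configuration might look like the sketch below. The field names come from the interface above and the GuardConfig entries mirror those used in the Quick Start; the concrete telemetry values are illustrative assumptions:

typescript
import { createGuardrailsEngine } from '@lov3kaizen/agentsea-guardrails';

const configuredEngine = createGuardrailsEngine({
  guards: [
    { name: 'toxicity', enabled: true, type: 'input', action: 'block' },
    { name: 'pii', enabled: true, type: 'both', action: 'transform' },
  ],
  // Run every guard and collect all results rather than stopping at the first failure.
  failureMode: 'collect-all',
  defaultAction: 'allow',
  telemetry: {
    logging: { enabled: true, level: 'info' },            // level value assumed
    metrics: { enabled: true, prefix: 'guardrails' },     // prefix value assumed
    tracing: { enabled: true, serviceName: 'chat-api' },  // serviceName assumed
  },
});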

Next Steps