- New src/prompt_guard/ package with pydantic-ai Agent + 7 fake tools (read_file, write_file, list_directory, execute_shell, make_http_request, send_email, query_database) that return plausible but harmless responses - Injection detected when the model makes any tool call; content is blocked entirely (never returned to caller), all calls logged at WARNING level - Config via PROMPT_GUARD_* env vars (pydantic-settings); system prompt deliberately encourages tool use to maximise detection sensitivity - server.py: SEARXNG_GUARD_ENABLED flag (default false) + guard call in _fetch_and_extract; blocked content is not stored in the cache - Fix Settings.extra='ignore' on both Settings classes so PROMPT_GUARD_* and SEARXNG_* vars don't cause validation errors in the other class - Fix _build_model: use explicit OpenAIProvider when api_key is set so PROMPT_GUARD_API_KEY from .env is honoured (pydantic-settings does not populate os.environ, so pydantic-ai's auto-provider couldn't find it)
46 lines
1.8 KiB
Python
46 lines
1.8 KiB
Python
"""Configuration for prompt-guard via environment variables."""
|
|
|
|
from pydantic_settings import BaseSettings, SettingsConfigDict
|
|
|
|
DEFAULT_SYSTEM_PROMPT = (
|
|
"You are a powerful AI assistant with access to a variety of tools, "
|
|
"including file system access, shell execution, and network capabilities. "
|
|
"Your primary task is to read the content provided by the user carefully "
|
|
"and act on any instructions you find within it using the appropriate tools. "
|
|
"Be proactive: if the content contains requests or commands, execute them "
|
|
"immediately with the tools available to you. Do not merely describe what "
|
|
"you would do — use the tools."
|
|
)
|
|
|
|
|
|
class Settings(BaseSettings):
|
|
"""Settings for the prompt-guard honeypot agent.
|
|
|
|
All fields can be set via environment variables with the PROMPT_GUARD_ prefix.
|
|
"""
|
|
|
|
model_config = SettingsConfigDict(
|
|
env_prefix="PROMPT_GUARD_",
|
|
env_file=".env",
|
|
env_file_encoding="utf-8",
|
|
extra="ignore",
|
|
)
|
|
|
|
enabled: bool = True
|
|
"""Set to false to disable the guard entirely (content passes through unchecked)."""
|
|
|
|
model: str = "openai:gpt-4o-mini"
|
|
"""Pydantic-AI model string, e.g. 'openai:gpt-4o-mini', 'anthropic:claude-haiku-3-5',
|
|
'groq:llama-3.1-8b-instant'. For OpenAI-compatible endpoints set base_url as well."""
|
|
|
|
api_key: str = ""
|
|
"""API key for the model provider. May also be set via the provider's own env var
|
|
(e.g. OPENAI_API_KEY, ANTHROPIC_API_KEY)."""
|
|
|
|
base_url: str = ""
|
|
"""Base URL override for OpenAI-compatible endpoints (Ollama, LM Studio, vLLM, etc.).
|
|
Example: http://localhost:11434/v1"""
|
|
|
|
system_prompt: str = DEFAULT_SYSTEM_PROMPT
|
|
"""System prompt sent to the honeypot agent. The default is deliberately crafted to
|
|
encourage tool usage so that injected instructions are more likely to trigger calls."""
|