feat: add prompt-guard honeypot for prompt injection detection

- New src/prompt_guard/ package with pydantic-ai Agent + 7 fake tools
  (read_file, write_file, list_directory, execute_shell, make_http_request,
  send_email, query_database) that return plausible but harmless responses
- Injection detected when the model makes any tool call; content is blocked
  entirely (never returned to caller), all calls logged at WARNING level
- Config via PROMPT_GUARD_* env vars (pydantic-settings); system prompt
  deliberately encourages tool use to maximise detection sensitivity
- server.py: SEARXNG_GUARD_ENABLED flag (default false) + guard call in
  _fetch_and_extract; blocked content is not stored in the cache
- Fix Settings.extra='ignore' on both Settings classes so PROMPT_GUARD_*
  and SEARXNG_* vars don't cause validation errors in the other class
- Fix _build_model: use explicit OpenAIProvider when api_key is set so
  PROMPT_GUARD_API_KEY from .env is honoured (pydantic-settings does not
  populate os.environ, so pydantic-ai's auto-provider couldn't find it)

This commit is contained in:

Hans Aschauer

2026-04-21 19:45:19 +02:00

parent 27e0805898

commit 678e052315

8 changed files with 1602 additions and 56 deletions

									
										1

pyproject.toml
									
										View file
										
				@ -10,6 +10,7 @@ requires-python = ">=3.14"

				dependencies = [

				    "fastmcp>=3.2.4",

				    "httpx>=0.28.1",

				    "pydantic-ai>=0.3.0",

				    "pydantic-settings>=2.13.1",

				    "trafilatura>=2.0.0",

				]

Rows
Columns

feat: add prompt-guard honeypot for prompt injection detection

1 pyproject.toml Unescape Escape View file

1

pyproject.toml

View file