Add vision/multimodal support to chat, allowing users to send images
alongside or instead of text prompts. Images are transmitted and persisted
as base64 data URLs.
Backend:
- Add Images []string to Message struct for persistence
- Add Images []string to GenerateTextRequest with relaxed validation
- Build multimodal user messages using OpenAI SDK content parts
- Pass images through from handlers to client
- Deep-copy Images slice in message cloning
Frontend:
- Add images?: string[] to Message and GenerateTextRequest types
- Add image selection state and file input handler
- Add camera icon button, hidden file input, and image preview strip
- Render images in user message bubbles
- Pass images through to GenerateTextRequest
Tests:
- Add TestSendMessageWithImage for vision model testing
vLLM sends thinking content in a "reasoning" delta field, unlike
DeepSeek which uses "reasoning_content". Check both field names so
thinking blocks render for vLLM-hosted models like qwen3.6-27b-thinking.
Also update client tests to exercise thinking output and skip by default
so they don't run in Drone CI (require live LLM API).