Companion GUI - SHAP Multi-Modal LLM Explainer ============================================== Overview -------- Separate repository (https://github.com/mvishiu11/shap-mllm-explainer) contains a companion, GUI-first application for exploring token-level and audio time-series attributions produced by ``mllm_shap``. The app is intentionally containerized (Docker Compose) to provide a reproducible, production-like environment. Architecture ------------ The stack is composed of: - **Frontend**: React (Vite build), served by Nginx - **Backend**: FastAPI - **Database**: PostgreSQL (stores sessions and configuration snapshots) This can be seen on the following diagram: .. image:: _static/arch_simple.png :alt: GUI Architecture Diagram :width: 50% :align: center :class: padded-image At a high level, the frontend: - loads a model/connector mode - accepts text input and optional audio - triggers explainability jobs - displays attribution visualizations - persists and reloads sessions The backend: - validates inputs and model state - runs inference and SHAP computation (via ``mllm_shap``) - exposes progress/cancel/logs for long-running jobs - stores sessions in the database Some examples of the running GUI are shown below: .. image:: _static/gui_screenshot.png :alt: GUI Example 1 :width: 70% :align: center :class: padded-image .. image:: _static/gui_screenshot_audio.png :alt: GUI Example 2 :width: 70% :align: center :class: padded-image Running the application (Docker Compose) --------------------------------------- CPU ^^^ From ``shap-mllm-explainer/``: .. code-block:: bash docker compose up --build Open the UI: - ``http://localhost`` GPU (optional) ^^^^^^^^^^^^^^ Prerequisites on the host: - NVIDIA driver installed and working (``nvidia-smi`` works) - NVIDIA Container Toolkit configured for Docker Run: .. code-block:: bash docker compose -f docker-compose.yaml -f docker-compose.gpu.yaml up --build Quick verification inside the backend container: .. code-block:: bash docker compose exec backend uv run python -c "import torch; print('cuda_available=', torch.cuda.is_available()); print('torch_cuda=', torch.version.cuda)" Local development (without Docker) ---------------------------------- Backend ^^^^^^^ From ``shap-mllm-explainer/backend/``: .. code-block:: bash uv sync --dev uv run uvicorn app.main:app --reload --host 0.0.0.0 --port 8000 Frontend ^^^^^^^^ From ``shap-mllm-explainer/web/``: .. code-block:: bash npm i npm run dev Configuration ------------- Frontend configuration ^^^^^^^^^^^^^^^^^^^^^^ - ``VITE_API_BASE_URL``: backend base URL (defaults to ``/api``) Backend configuration ^^^^^^^^^^^^^^^^^^^^^ - ``DATABASE_URL``: async SQLAlchemy URL (Compose uses ``postgresql+asyncpg://...``) - ``LOG_LEVEL``: log verbosity (e.g. ``INFO`` / ``DEBUG``) - ``UV_PYTHON``: set in Compose to prevent ``uv`` from downloading another CPython inside the container Compose files to inspect: - ``shap-mllm-explainer/docker-compose.yaml`` - ``shap-mllm-explainer/docker-compose.gpu.yaml`` Backend API surface (high level) -------------------------------- The backend exposes FastAPI routes under the ``/api`` prefix. General: - ``GET /health`` ML endpoints (``/api/ml``): - ``POST /models/load`` — load a supported connector/mode - ``POST /predict`` — run a prediction (text-only, or text+audio for LiquidAudio mode) - ``POST /explain`` — start an explainability job (supports text and optional audio) - ``GET /progress/{job_id}`` — progress reporting - ``POST /cancel/{job_id}`` — cooperative cancellation - ``GET /logs/{job_id}`` — retrieve the last log lines from a job - ``GET /telemetry`` — runtime telemetry (CPU/RAM/process + GPU if available) - ``GET /diagnostics`` — basic model/device diagnostics Notes: - Audio uploads are accepted only when the backend is running in LiquidAudio mode. - Text-only mode rejects audio requests with a clear 400 response.