• Built LangGraph workflow: 4-step pipeline for scene detection, RAG retrieval, safety assessment, and image editing.
• Used GPT-4o / Gemini VLM to analyze home images, detecting room type, hazards, and safety features.
• Built Hybrid RAG (FAISS + BM25) retrieving 105 curated CDC/HSSAT guidelines with risk-level citations.
• Fused deterministic scoring rules with LLM reasoning to output 0-10 safety score and cited recommendations.
• Generated safety improvement previews via Gemini Nano Banana showing recommended modifications.
• Shipped FastAPI + React app with Docker deployment, supporting 11 scene types and real-time analysis.
Deep Dive — SafeLLM / Fall Risk Detection AI System (safellm3/safellm_deploy)
Scope note: this workspace contains multiple iterations (safellm/, safellm2/, safellm3/). The only directory with git history is safellm3/safellm_deploy/ (contains .git/), so this Deep Dive treats that as the “repo” for evidence and history.
1. What This Is (one paragraph)
SafeLLM is a deployable web app + API that takes a single photo of a home environment, classifies the scene into one of 11 fall‑risk categories, retrieves fall‑prevention guidelines from a small curated knowledge base, and returns a structured safety report (score, hazards, prioritized actions, cost/difficulty) plus an optional AI‑generated “visual improvements” image that overlays the recommended fixes. The repo also contains extracted guideline documents (e.g., CDC STEADI PDFs → markdown) under knowledge_base/processed/ for provenance/transparency.
2. Who It’s For + Use Cases
Primary users (as described in the repo docs):
Family caregivers assessing an elderly parent’s home for preventable fall hazards.
Clinicians / discharge planners doing a quick home safety pre‑screen.
Home modification services triaging what to fix first and estimating effort/cost.
Real estate / property managers evaluating accessibility and safety.
What “success” means (repo evidence + gaps):
Success (evidenced): system returns a structured report and can boot/build/test reliably (run_full_deploy_test.sh).
Success (inferred but not measured in repo): fewer missed hazards, fewer hallucinated hazards, actionable fixes, low latency, low cost.
Unknown (not found in repo): defined product metrics (accuracy, NPS, retention, clinical outcomes). Suggested metrics are in §8.
3. Product Surface Area (Features)
A. End‑user web experience (React)
Upload photo → POST /assess (multipart file) from the UI (frontend/src/App.jsx:27).
Live “Analyzing…” UI while waiting (non‑streaming; one blocking request).
Structured results page:
Score + risk level
Hazard lists (critical/important/minor)
Priority action plan
Cost + difficulty
Knowledge Base References section (shows which guidelines were retrieved and match %)
Visual Safety Improvements (polls backend until edited image is ready) (frontend/src/components/Results.jsx:43)
Print report via window.print() (frontend‑only).
B. Backend API (FastAPI)
User‑visible endpoints:
GET / serves the built frontend (frontend/dist/) if present, else returns API info (backend/api.py:235).
GET /health returns workflow status + active model configuration (backend/api.py:267).
POST /assess runs the core pipeline (steps 0–3 sync; step 4 async) (backend/api.py:290).
POST /scene-detect runs scene classification only (backend/api.py:529).
GET /categories returns supported scene categories (backend/api.py:587).
GET /stats returns curated KB chunk counts by category (backend/api.py:609).
GET /edit_status/{image_id} returns async image-edit job status (backend/api.py:642).
GET /edited_images/{image_id}_edited.png serves the edited image (backend/api.py:633).
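A minimal client sketch of this surface (the endpoint paths come from the list above; the multipart field name "file" and response fields such as image_id, safety_score, and status are assumptions, not confirmed from the repo):

```python
# Minimal client sketch for the endpoints above. Assumes the server runs on
# localhost:8000; the multipart field name "file" and the response fields
# (image_id, safety_score, risk_level, status) are illustrative guesses.
import time
import requests

BASE = "http://localhost:8000"

with open("test_images/bathroom2.jpg", "rb") as f:
    resp = requests.post(f"{BASE}/assess",
                         files={"file": ("bathroom2.jpg", f, "image/jpeg")})
resp.raise_for_status()
report = resp.json()
print(report.get("safety_score"), report.get("risk_level"))  # hypothetical fields

# Step 4 runs asynchronously: poll /edit_status until the preview is ready.
image_id = report.get("image_id")  # hypothetical field
while image_id:
    status = requests.get(f"{BASE}/edit_status/{image_id}").json()
    if status.get("status") in ("done", "error"):  # hypothetical status values
        break
    time.sleep(2)
if image_id and status.get("status") == "done":
    png = requests.get(f"{BASE}/edited_images/{image_id}_edited.png").content
    with open("preview.png", "wb") as out:
        out.write(png)
```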
C. Knowledge base tooling
Curated KB stored as structured markdown under knowledge_base/curated_knowledge/ and compiled into JSONL (knowledge_base/curated_chunks/metadata.jsonl) via knowledge_base/process_curated_knowledge.py.
FAISS index built via knowledge_base/create_curated_embeddings.py (OpenAI embeddings).
KB linting via knowledge_base/kb_lint.py + unit tests in tests/test_knowledge_base.py.
D. Deployment/build tooling
Dockerfile builds frontend + runs backend (Cloud Run‑style PORT support).
start_server.sh and start_server.bat for local startup (shell + Windows).
Constraints / caveats (evidence vs unknown):
No auth / no user accounts (evidenced by code search; no auth middleware; see §7).
Docs drift: multiple READMEs describe older model choices and paths (e.g., gpt‑4o vs gemini/gpt‑5; different env var names). Evidence is in file-level references in §11.
B. Repo inventory (top 2–3 levels, focus on runtime)
safellm3/safellm_deploy/
  backend/
    api.py                      # FastAPI server + endpoints
    workflow.py                 # 4-step workflow + providers + async image edit
  frontend/
    src/                        # React UI (upload/results/polling)
  knowledge_base/
    curated_knowledge/          # human-authored markdown hazards by scene
    curated_chunks/             # JSONL metadata for 105 curated chunks
    curated_embeddings/         # FAISS index used at runtime
    processed/                  # extracted source docs (CDC PDFs -> markdown) for transparency
  prompts/
    scene_detection_prompt.py
    safety_assessment_prompt.py
    image_editing_prompt.py
  tests/
    test_knowledge_base.py
  Dockerfile
  requirements.txt
  run_full_deploy_test.sh
  start_server.sh
  test_frontend.py              # manual integration script (skipped under pytest)
D. Key modules (what they do / why they matter)
backend/api.py: FastAPI app that owns the HTTP contract (uploads, responses, polling) and also serves the built SPA in production; it’s the main deployable surface and where reliability controls (EXIF fixes, cleanup, async jobs) live.
backend/workflow.py: Core orchestration for the 4-step pipeline (provider selection, determinism, retrieval wiring, image-edit generation); this is where most AI behavior is defined.
knowledge_base/curated_retrieval.py: Hybrid FAISS+BM25 retrieval that grounds the LLM in a small, scene-filtered knowledge base; it strongly shapes output relevance and consistency.
knowledge_base/process_curated_knowledge.py: “Compiler” from curated markdown → structured JSONL chunks; enforces enumerations (risk levels, hazard types) and creates stable IDs for retrieval.
knowledge_base/create_curated_embeddings.py: Builds the FAISS index used at runtime; without it, retrieval cannot load.
prompts/scene_detection_prompt.py: Defines the scene classifier output shape and allowed categories; constrains LLM1 to avoid adding noise.
prompts/safety_assessment_prompt.py: Defines strict JSON schema + scoring conventions for LLM2; primary control surface for hallucination and output stability.
prompts/image_editing_prompt.py: Converts a small structured “edit plan” into a constrained natural-language image prompt; drives consistent visuals across providers.
frontend/src/App.jsx: Upload handler and environment-based API routing (VITE_API_BASE vs localhost); defines the user flow into /assess.
frontend/src/components/Results.jsx: Results rendering and async polling for the edited image (/edit_status/{image_id}); defines the post-upload UX.
tests/test_knowledge_base.py: Unit tests that protect curated KB processing/validation from regressions.
run_full_deploy_test.sh: Repeatable “deploy simulation” (deps → pytest → build) that makes build confidence auditable.
test_frontend.py: Manual end-to-end script (uploads real images and polls image edits); useful for smoke testing but intentionally skipped in CI-style pytest runs.
C. Key runtime assumptions
Backend is single-process and keeps job status in memory (JOBS = {} in backend/api.py:154), so pending image edits are not durable across restarts.
File storage for uploads/edited images is local disk; cleanup is best-effort (24h window) (backend/api.py:60, called on startup and after successful /assess).
External providers must be reachable for /assess to complete fully (OpenAI embeddings always; Gemini/OpenAI for LLMs; optional OpenRouter/OpenAI Images for step 4).
5. Data Model
There is no database in this deployable repo. Data is stored as:
A. Runtime request state (in-memory)
Per-request workflow state is a Python dict matching WorkflowState in backend/workflow.py (contains image_base64, scene_category, retrieved_knowledge, hazards, etc.).
Async image editing job status is stored in an in-memory dict JOBS (backend/api.py:154, /edit_status/{image_id} at backend/api.py:642).
B. Runtime files (local disk)
uploads/<uuid>.<ext>: incoming images (kept at least long enough for async edit to run).
edited_images/<uuid>_edited.png: the “visual improvements” output image.
Cleanup: files older than 24h are deleted on startup and after successful /assess (backend/api.py:60, backend/api.py:183, backend/api.py:370).
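A minimal sketch of that 24-hour cleanup policy, assuming the uploads/ and edited_images/ directory names above; the function shape is illustrative, not the repo's actual _cleanup_old_images:

```python
# Illustrative stand-in for the 24h cleanup described above (the repo's actual
# implementation is _cleanup_old_images in backend/api.py around line 60).
import time
from pathlib import Path

MAX_AGE_SECONDS = 24 * 60 * 60  # 24h retention window

def cleanup_old_images(dirs=("uploads", "edited_images")) -> None:
    cutoff = time.time() - MAX_AGE_SECONDS
    for d in dirs:
        for path in Path(d).glob("*"):
            try:
                # Remove files last modified more than 24 hours ago.
                if path.is_file() and path.stat().st_mtime < cutoff:
                    path.unlink()
            except OSError:
                pass  # best-effort: ignore files deleted or locked concurrently
```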
C. Curated Knowledge Base (static artifacts in repo)
knowledge_base/curated_chunks/metadata.jsonl: 105 lines (one per chunk) with fields like chunk_id, category, hazard_name, risk_level, keywords, hazard_types, version. (The exact schema is visible by reading any JSONL line; see the processor in knowledge_base/process_curated_knowledge.py. An illustrative record is sketched after this list.)
knowledge_base/curated_embeddings/faiss_index/: FAISS index built from curated chunks.
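Illustrative metadata.jsonl record, shown as a Python dict using the field names listed above; all values are invented for illustration:

```python
# Hypothetical metadata.jsonl record; field names follow the list above,
# all values are invented.
example_chunk = {
    "chunk_id": "bathroom_001",
    "category": "bathroom",
    "hazard_name": "Missing grab bars at tub/shower",
    "risk_level": "critical",
    "keywords": ["grab bar", "tub", "shower", "transfer"],
    "hazard_types": ["missing_equipment"],
    "version": "1.0",
}
```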
D. “Raw/processed” source docs (transparency / provenance)
knowledge_base/raw_documents/ and knowledge_base/processed/: downloaded PDFs (e.g., CDC STEADI) and extracted markdown by category. Example file: knowledge_base/processed/indoor/kitchen/cdc_81518_DS1_extracted.md contains per-page text plus metadata header.
Note: the “raw documents” pipeline in this deployable folder references text_extractor.py (knowledge_base/process_documents.py:18) which is not present in safellm3/safellm_deploy/ (it exists in sibling directories). Running that pipeline here is Unknown (likely broken without copying that file).
Vector store: FAISS, loaded from disk (knowledge_base/curated_retrieval.py:39).
Operational impact:
Every retrieval likely requires embedding the query (cost + latency).
Optimization opportunity: the current retrieval query is category-driven and repeated per scene ("{category} fall safety hazards and improvements for elderly"), so embeddings can be cached per category.
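Because the query is fixed per category, the embedding can be memoized. A minimal sketch, assuming the openai>=1.x client and an illustrative embedding model name:

```python
# Sketch: memoize the category-driven query embedding so repeated /assess calls
# for the same scene type skip the remote embeddings call. Assumes the
# openai>=1.x client; the model name is illustrative, the query template is
# the one quoted above.
from functools import lru_cache
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

@lru_cache(maxsize=32)  # one entry per scene category (11 categories today)
def category_query_embedding(category: str) -> tuple:
    query = f"{category} fall safety hazards and improvements for elderly"
    resp = client.embeddings.create(model="text-embedding-3-small", input=query)
    return tuple(resp.data[0].embedding)  # tuple so lru_cache can store it
```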
The backend performs basic “post-LLM” validation and attaches validation_warnings (citation count, summary length, hazard counts) (backend/workflow.py:843–895).
Important consistency note:
The current prompt/schema includes internal contradictions (e.g., schema enumerations that force confidence="high" while other parts mention low/medium; some post-processing expects different hazard field names). This is not a runtime crash in normal paths, but it is a maintainability risk (§9).
E. Visual feedback (image editing; Step 4)
Step 4 is optional and runs asynchronously in the API:
/assess schedules a background task and returns immediately (backend/api.py:405).
Job status is polled via /edit_status/{image_id} (backend/api.py:642).
How the image edit is generated (backend/workflow.py:969+):
Ask LLM2 for a small editing plan (JSON; ≤3 annotations) (backend/workflow.py:980+).
Convert that plan into a constrained natural-language image prompt (build_prompt_from_plan) (backend/workflow.py:1085, prompts/image_editing_prompt.py:150).
Call one of:
Gemini image generation (if IMAGE_EDIT_MODEL starts with gemini-)
OpenRouter image generation (if IMAGE_EDIT_MODEL contains /) (backend/workflow.py:1142, POST https://openrouter.ai/api/v1/chat/completions at backend/workflow.py:1188)
OpenAI Images edit API (client.images.edit) (backend/workflow.py:1272)
Save output to edited_images/<image_id>_edited.png and serve it via backend route (backend/workflow.py:1310+, backend/api.py:633).
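A simplified sketch of this schedule-then-poll pattern, assuming FastAPI BackgroundTasks and an in-memory job dict; names and status values are illustrative, not copied from backend/api.py:

```python
# Simplified sketch of the async step-4 pattern: /assess schedules a background
# job and returns immediately; /edit_status/{image_id} reports progress.
import uuid
from fastapi import BackgroundTasks, FastAPI

app = FastAPI()
JOBS: dict = {}  # image_id -> {"status": "pending" | "done" | "error"}

def run_image_edit(image_id: str) -> None:
    try:
        # ... call the image-edit provider, write edited_images/{image_id}_edited.png ...
        JOBS[image_id] = {"status": "done",
                          "url": f"/edited_images/{image_id}_edited.png"}
    except Exception as exc:  # step 4 is non-fatal: record the error, don't raise
        JOBS[image_id] = {"status": "error", "detail": str(exc)}

@app.post("/assess")
async def assess(background_tasks: BackgroundTasks):
    image_id = str(uuid.uuid4())
    JOBS[image_id] = {"status": "pending"}
    background_tasks.add_task(run_image_edit, image_id)
    return {"image_id": image_id}  # text-report fields omitted for brevity

@app.get("/edit_status/{image_id}")
async def edit_status(image_id: str):
    return JOBS.get(image_id, {"status": "unknown"})
```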
F. Evaluation (what exists)
Evidenced evaluation/testing assets:
Unit tests for curated knowledge processing + linting: tests/test_knowledge_base.py.
A manual end-to-end integration script (skipped under pytest) that uploads test images and polls edit status: test_frontend.py (pytestmark skip at test_frontend.py:16).
Sample test images + saved API outputs: test_images/bathroom2.jpg, test_images/bathroom2_api_result.json, etc.
Not present (gap): regression tests for hallucination rate or grounding quality.
7. Reliability, Security, and Privacy
Reliability & correctness mechanisms
EXIF orientation normalization + downscaling on upload (addresses iPhone camera photos and reduces model token load): backend/api.py:94 with MAX_IMAGE_DIMENSION default 2048 (backend/api.py:91).
Retry logic for Gemini safety assessment (handles transient empty responses): in backend/workflow.py Step 3 (Gemini path).
Async image editing is non-fatal: Step 4 failures do not fail the whole assessment (backend/workflow.py:1320+).
Disk growth control: deletes uploads/edited images older than 24h (backend/api.py:60).
Fixed retrieval k=5 and a category-driven query (backend/workflow.py:551–557).
Strict JSON schemas in the prompts (prompt-level determinism).
Security posture (current)
No authentication / authorization. Anyone with network access to the backend can call /assess.
CORS is fully open (allow_origins=["*"]) (backend/api.py:52), which is convenient for demos but not safe for a public deployment without further controls.
File upload validation checks that the MIME type starts with image/ (backend/api.py:308) but does not enforce a server-side size limit (the frontend UI suggests “max 10MB”, but the backend does not appear to enforce it); a minimal server-side check is sketched after this list.
Prompt injection risk exists via image content (e.g., text in the image). Mitigations in repo are mostly prompt-level constraints + strict JSON output; there is no explicit “prompt injection sanitizer” or allowlist of visual evidence.
In-memory job state means attackers could potentially cause memory growth by spamming /assess (no rate limit).
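One way to close the size-limit gap noted above; a sketch assuming FastAPI's UploadFile, with the 10 MB value taken from the frontend hint:

```python
# Sketch: enforce the 10 MB limit server-side before running the workflow.
# The limit mirrors the frontend hint; the wiring here is illustrative.
from fastapi import FastAPI, File, HTTPException, UploadFile

app = FastAPI()
MAX_UPLOAD_BYTES = 10 * 1024 * 1024  # 10 MB, matching the UI hint

@app.post("/assess")
async def assess(file: UploadFile = File(...)):
    if not (file.content_type or "").startswith("image/"):
        raise HTTPException(status_code=400, detail="Only image uploads are accepted")
    data = await file.read()
    if len(data) > MAX_UPLOAD_BYTES:
        raise HTTPException(status_code=413, detail="Image exceeds the 10MB limit")
    # ... continue with EXIF normalization, downscaling, and the 4-step workflow ...
    return {"received_bytes": len(data)}
```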
Privacy & data handling
Local disk stores uploaded images and edited images at least temporarily (uploads are retained for async editing; best-effort cleanup after 24h).
Third-party processing:
OpenAI: embeddings (always) and optionally vision + image editing depending on env config.
Google Gemini: vision LLM steps if configured.
OpenRouter: image generation if configured.
Repo docs claim “images are not stored permanently” (README.md), but the actual server implementation stores them on disk and cleans them later. Treat “not stored permanently” as “not intended to be retained long-term,” not as “never written to disk.”
Likely slow parts: vision LLM calls + image generation; retrieval is local, but the embeddings call may be remote.
Evidence: backend/api.py:290, backend/workflow.py:551, backend/workflow.py:969.
Q2: How would you scale this backend horizontally?
Current blockers: in-memory JOBS state and local-disk image storage (not shared across instances).
Make job state durable: Redis / DB + queue (Celery/RQ) or managed task queue.
Move images to object storage (S3/GCS) with signed URLs.
Ensure retrieval artifacts are bundled or cached per instance; warm start loads FAISS index at boot.
Evidence: backend/api.py:154, backend/api.py:633, backend/workflow.py:275.
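A minimal sketch of durable job state with Redis, per the answer above; key naming and the TTL are assumptions:

```python
# Sketch: replace the in-memory JOBS dict with Redis so any instance can answer
# /edit_status/{image_id}. Key naming and the 24h TTL are assumptions.
import json
import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

def set_job_status(image_id: str, status: dict) -> None:
    r.set(f"job:{image_id}", json.dumps(status), ex=24 * 60 * 60)

def get_job_status(image_id: str) -> dict:
    raw = r.get(f"job:{image_id}")
    return json.loads(raw) if raw else {"status": "unknown"}
```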
Q3: How would you enforce tenant isolation / authentication?
Add auth layer (JWT/OAuth) at FastAPI, enforce per-user rate limits and storage namespaces.
Right now there is no auth; CORS is wildcard.
Evidence: backend/api.py:52 and no auth code found.
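One hedged option for that auth layer: a bearer-token dependency. The library choice (PyJWT) and the claim/secret names are assumptions, not anything the repo uses today:

```python
# Sketch: a bearer-token dependency that could gate /assess. PyJWT and the
# claim/secret names are illustrative choices.
import jwt  # PyJWT
from fastapi import Depends, FastAPI, HTTPException
from fastapi.security import HTTPAuthorizationCredentials, HTTPBearer

app = FastAPI()
bearer = HTTPBearer()
SECRET = "change-me"  # would come from an env var / secret manager in practice

def current_user(creds: HTTPAuthorizationCredentials = Depends(bearer)) -> str:
    try:
        claims = jwt.decode(creds.credentials, SECRET, algorithms=["HS256"])
        return claims["sub"]
    except (jwt.PyJWTError, KeyError):
        raise HTTPException(status_code=401, detail="Invalid or expired token")

@app.post("/assess")
async def assess(user_id: str = Depends(current_user)):
    # Per-user rate limits and storage namespaces (e.g., uploads/{user_id}/) hang off user_id.
    return {"user": user_id}
```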
Q4: What’s your data retention policy and how do you implement it?
Intended: temporary storage only; best-effort cleanup after 24 hours (_cleanup_old_images).
In practice: images are written to disk; deletions occur on startup and after successful /assess.
Hybrid weights are explicitly tuned (0.6/0.4) and threshold-filtered to avoid noise.
Curated KB is small and structured, improving precision vs open web scraping.
Evidence: knowledge_base/curated_retrieval.py:68, knowledge_base/curated_retrieval.py:191–194.
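A minimal sketch of weighted hybrid fusion as described above (0.6/0.4 plus a score threshold); the normalization scheme and threshold value are assumptions, not read from curated_retrieval.py:

```python
# Sketch of weighted hybrid fusion: dense (FAISS) and sparse (BM25) scores are
# min-max normalized, combined 0.6/0.4, then threshold-filtered.
def hybrid_rank(dense: dict, sparse: dict,
                w_dense: float = 0.6, w_sparse: float = 0.4,
                threshold: float = 0.3, k: int = 5) -> list:
    def normalize(scores: dict) -> dict:
        if not scores:
            return {}
        lo, hi = min(scores.values()), max(scores.values())
        span = (hi - lo) or 1.0
        return {cid: (s - lo) / span for cid, s in scores.items()}

    d, s = normalize(dense), normalize(sparse)
    fused = {cid: w_dense * d.get(cid, 0.0) + w_sparse * s.get(cid, 0.0)
             for cid in set(d) | set(s)}
    ranked = sorted(fused.items(), key=lambda kv: kv[1], reverse=True)
    return [(cid, score) for cid, score in ranked if score >= threshold][:k]
```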
Q6: How do you control hallucinations?
Hard constraints: strict JSON schema outputs, consistent scoring rules, and “only report what is visible” instructions (prompt-level).
Retrieval grounding: the prompt injects retrieved guidelines and instructs explicit citations; backend warns when citations are missing.
Limitations: no automated grounding verifier; no image-region evidence mapping; confidence field currently may be forced high by schema.
Evidence: prompts/safety_assessment_prompt.py (schema + instructions), backend/workflow.py:843–895.
Q7: Why is the retrieval query category-driven instead of image-driven?
Chosen for determinism and stability: fixed query + fixed k reduces run-to-run variability.
Tradeoff: less adaptive retrieval; relies on small curated KB and LLM2 to choose relevant hazards.
Evidence: backend/workflow.py:551–557.
Q8: How would you evaluate this system offline?
Create a labeled dataset of images per scene + expected hazards and priority actions.
Compute:
scene classification accuracy
hazard precision/recall by risk tier
grounding score: % hazards mapped to retrieved guidelines
cost/difficulty calibration vs human baseline
Add regression tests that run on PRs with fixed seeds and stable snapshots.
Evidence: existing hooks: tests/test_knowledge_base.py, test_images/*, test_frontend.py.
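A sketch of two of these metrics on a hypothetical labeled example; the dataset shape and field names are invented:

```python
# Sketch of hazard precision/recall and a grounding score on one hypothetical
# labeled example; hazard names and chunk IDs are invented.
def hazard_precision_recall(expected: set, predicted: set) -> tuple:
    tp = len(expected & predicted)
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(expected) if expected else 0.0
    return precision, recall

def grounding_score(citations: dict) -> float:
    # Share of predicted hazards that cite a retrieved guideline chunk.
    if not citations:
        return 0.0
    return sum(1 for chunk_id in citations.values() if chunk_id) / len(citations)

# One labeled example (values invented):
p, r = hazard_precision_recall({"loose rug", "no grab bar"}, {"no grab bar", "clutter"})
g = grounding_score({"no grab bar": "bathroom_001", "clutter": None})
print(p, r, g)  # 0.5 0.5 0.5
```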
Debugging & reliability
Q9: Tell me about a tricky production bug you fixed.
Example evidenced in git history: “Root cause of iPhone direct camera upload failures” and Gemini robustness fixes.
Step 4 exceptions are caught and do not fail the main response; job status becomes error and frontend continues showing text results.
Evidence: backend/workflow.py:1320+, backend/api.py:405, backend/api.py:642.
Product sense
Q11: What’s the core user promise and how do you keep it?
Promise: “upload one photo, get actionable fall-prevention improvements.”
Risk: no explicit user feedback loop in product; no saved reports; no trust UI beyond “knowledge base references.”
Evidence: frontend/src/components/Results.jsx + KB reference section.
Q12: If you had to pick one metric to optimize first, what is it?
Suggest: “Actionability rate” (users who implement ≥1 recommendation) + “time-to-first-action.”
Instrument: track which priority actions were shown + user follow-up surveys.
Repo currently has no analytics; would need to add.
Evidence: no telemetry found in code search.
Behavioral
Q13: How did you manage ambiguity in requirements?
Built a small, high-precision curated KB (105 chunks) rather than scraping the web; kept scope to 11 scenes.
Added determinism controls to keep outputs stable for demos and testing.
Validated by unit tests for KB processing and a deploy simulation script.
Evidence: knowledge_base/curated_chunks/metadata.jsonl, backend/workflow.py:551, run_full_deploy_test.sh.
Q14: How do you communicate tradeoffs to stakeholders?
Example: “We use fixed retrieval to reduce variability, which may miss some edge hazards; roadmap adds image-driven query + eval harness.”
Backed by clear evidence anchors and a phased hardening plan (auth, storage, queue).
Evidence: backend/workflow.py:551–557, backend/api.py:154.
13. Roadmap (high-leverage upgrades)
Must (production hardening)
Add auth + rate limiting (protect /assess, mitigate abuse); restrict CORS in production (backend/api.py:52).
Make async image-edit jobs durable (queue + shared state) and move images to object storage (S3/GCS).
Unify docs with current code: model defaults, env var names (VITE_API_BASE vs older VITE_API_URL), deployment steps, and correct file paths.