Turn business data into queryable insights with natural language + graph visualization.
GraphQuery is a full-stack system that accepts natural language questions, uses LLMs to generate SQL against PostgreSQL (Supabase), executes the query on real relational data, and transforms results into an interactive graph in the UI.
- Live App: https://dodge-wheat-eight.vercel.app/ (deployed on Vercel + Render + Supabase)
Operational business data (orders, invoices, deliveries, payments, customers, products) is typically stored across multiple normalized tables. This creates two common challenges:
- Fragmented visibility: understanding end-to-end flow requires multiple joins and technical SQL knowledge.
- Slow investigation cycles: business users cannot quickly ask ad-hoc questions without analyst support.
A graph-based abstraction helps users understand entity relationships and flow progression faster than raw table output alone.
Dodge AI provides a natural language interface for operational analytics:
- User asks a question in chat.
- Groq generates SQL grounded in the live PostgreSQL schema.
- Backend validates/sanitizes SQL and executes it.
- Result rows are transformed into graph nodes and edges.
- UI renders summary + table + interactive graph.
Pipeline: Natural Language → SQL → PostgreSQL → Graph Model → React UI
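The pipeline above can be sketched as a single orchestration function. This is an illustrative sketch, not the project's actual code: every name is a stand-in, and the stubs replace the real Groq and PostgreSQL calls.

```python
# Illustrative sketch of the pipeline stages; all function names are
# stand-ins for the real Groq and PostgreSQL integrations.

def generate_sql(question: str) -> str:
    # Real version: prompt Groq with the live schema and parse its JSON reply.
    return "SELECT order_id, customer_id FROM orders LIMIT 20"

def execute(sql: str) -> list:
    # Real version: run the validated SQL via a pooled psycopg connection.
    return [{"order_id": 1, "customer_id": 320000083}]

def build_graph(rows: list) -> dict:
    # Rows become entity nodes and relationship edges for React Flow.
    nodes = [{"id": f"order:{r['order_id']}", "type": "order"} for r in rows]
    edges = [{"source": f"customer:{r['customer_id']}",
              "target": f"order:{r['order_id']}",
              "label": "placed"} for r in rows]
    return {"nodes": nodes, "edges": edges}

def answer_question(question: str) -> dict:
    sql = generate_sql(question)   # Natural Language -> SQL
    rows = execute(sql)            # SQL -> PostgreSQL rows
    graph = build_graph(rows)      # rows -> Graph Model
    return {"type": "graph", "sql": sql, "rows": rows, "graph": graph}
```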
- Chat-first interface for submitting questions
- Handles `graph`, `empty`, and `error` API response types
- Renders graph using React Flow (`@xyflow/react`)
- Displays tabular query result preview alongside graph interactions
- `/query` endpoint for natural-language query processing
- `/query/stream` endpoint for server-sent progressive status + final payload
- LLM orchestration (Groq SQL generation, Gemini summarization with fallback)
- SQL guardrails and schema-aware validation
- Graph builder for dynamic node/edge generation
- Source of truth for business entities
- Connection pooling via `psycopg_pool`
- Indexed for core query paths (orders/customer/order-flow joins)
- Groq: SQL generation from schema-grounded prompt
- Gemini: business summary generation for larger result sets
- Fallback: summary fallback to Groq or deterministic formatter when needed
### End-to-End Flow
User → FastAPI → Groq (SQL) → SQL Validation → PostgreSQL → Graph Builder → Gemini/Groq Summary → React UI
- React
- Vite
- `@xyflow/react` (React Flow)
- Axios
- D3-force
- FastAPI
- Uvicorn
- Pydantic
- Psycopg + Psycopg Pool
- PostgreSQL (Supabase)
- Groq API
- Gemini API
- Natural language querying over business dataset
- Dynamic SQL generation (no hardcoded query templates as primary path)
- Graph visualization from real query output (`nodes[]`, `edges[]`)
- Multi-entity relationship mapping:
  - customer, order, delivery, invoice, payment, product
- Structured API responses:
  - `type: "graph"`
  - `type: "empty"`
  - `type: "error"`
- Guardrails for dataset-only querying
- Timeout handling, fallback behavior, and error-safe responses
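The three structured response types behave as a discriminated union on the `type` field. A minimal sketch of that contract — field names other than `type` are assumptions for illustration, not the project's exact payload shape:

```python
from typing import Literal, TypedDict, Union

class GraphResponse(TypedDict):
    type: Literal["graph"]
    summary: str
    nodes: list
    edges: list

class EmptyResponse(TypedDict):
    type: Literal["empty"]
    message: str

class ErrorResponse(TypedDict):
    type: Literal["error"]
    error: str

QueryResponse = Union[GraphResponse, EmptyResponse, ErrorResponse]

def classify(payload: dict) -> str:
    # The frontend branches on this discriminator to pick a renderer.
    return payload["type"]
```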
- `Show all orders`
- `Trace order flow`
- `Top products`
- `Unpaid invoices`
- `Show customer 320000083 orders`
- `customers`
- `orders`
- `order_items`
- `deliveries`
- `invoices`
- `payments`
- `products`
- `orders.customer_id -> customers.customer_id`
- `deliveries.order_id -> orders.order_id`
- `invoices.order_id -> orders.order_id`
- `payments.customer_id -> customers.customer_id`
- `order_items.order_id -> orders.order_id`
- `order_items.product_id -> products.product_id`
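Chained over those keys, a full order-flow trace reduces to one join path. The query below is an illustrative example for this README, not output from the LLM; selected column names such as `delivery_id` and `payment_id` are assumptions about the schema:

```python
# Example end-to-end order-flow join over the foreign keys listed above.
# Column names beyond the FK pairs are assumed for illustration.
ORDER_FLOW_SQL = """
SELECT c.customer_id, o.order_id, d.delivery_id, i.invoice_id, p.payment_id
FROM customers c
JOIN orders o      ON o.customer_id = c.customer_id
JOIN deliveries d  ON d.order_id    = o.order_id
JOIN invoices i    ON i.order_id    = o.order_id
JOIN payments p    ON p.customer_id = c.customer_id
LIMIT 20
"""
```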
- Better concurrency and stability for multi-request API workloads
- Stronger indexing and query planning for relational analytics
- Production-ready connection pooling and deployment ergonomics
- Better fit for Supabase-managed cloud deployment
- Groq receives:
  - live database schema (tables + columns + foreign keys)
  - strict prompt rules (SELECT/WITH only, explicit aliases, row limits)
- Expected model output format: JSON

```json
{ "sql": "..." }
```
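Model replies sometimes wrap that JSON in a Markdown code fence, so parsing has to be defensive. A hedged sketch — the fence-stripping heuristic is an assumption for illustration, not the project's actual parser:

```python
import json

def extract_sql(raw: str) -> str:
    # Strip optional ```json ... ``` fencing before parsing (assumed heuristic).
    cleaned = raw.strip()
    if cleaned.startswith("```"):
        cleaned = cleaned.strip("`")
        if cleaned.lower().startswith("json"):
            cleaned = cleaned[4:]
    return json.loads(cleaned)["sql"]
```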
- Backend loads schema from PostgreSQL and injects it into the prompt.
- SQL is sanitized and then validated against known schema identifiers.
- Reject non-SELECT SQL and dangerous keywords (`DROP`, `DELETE`, etc.)
- Enforce limits (`LIMIT 20` default if missing)
- Validate aliases and identifier consistency
- Return structured error when SQL fails validation
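Taken together, the rules above can be sketched as a small validator. This is a minimal sketch of the listed guardrails, not the project's full implementation (which also validates aliases and identifiers against the live schema):

```python
import re

# Keyword denylist is illustrative; a real list would be broader.
FORBIDDEN = {"drop", "delete", "update", "insert", "alter", "truncate", "grant"}

def validate_sql(sql: str) -> str:
    stmt = sql.strip().rstrip(";")
    first = stmt.split(None, 1)[0].lower() if stmt else ""
    if first not in {"select", "with"}:
        raise ValueError("Only SELECT/WITH statements are allowed")
    # Token-level scan avoids false positives on columns like "deleted_at".
    tokens = set(re.findall(r"[a-z_]+", stmt.lower()))
    if tokens & FORBIDDEN:
        raise ValueError("Dangerous keyword rejected")
    if not re.search(r"\blimit\b", stmt, re.IGNORECASE):
        stmt += " LIMIT 20"   # enforce the default row cap
    return stmt
```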
- Primary summarization: Gemini (for larger result sets)
- Fallback: Groq summary
- Final fallback: deterministic local text formatter
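That fallback chain amounts to trying each provider in order and degrading to a deterministic string. A sketch with hypothetical stand-in callables for the Gemini and Groq clients:

```python
# Sketch of the summary fallback chain: Gemini -> Groq -> deterministic
# formatter. The two callables are hypothetical stand-ins for API clients.

def summarize(rows, gemini_call=None, groq_call=None) -> str:
    for call in (gemini_call, groq_call):
        if call is None:
            continue
        try:
            return call(rows)
        except Exception:
            continue  # provider failed or timed out; try the next one
    # Final deterministic fallback: no LLM required.
    return f"Query returned {len(rows)} row(s)."
```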
- `customer`
- `order`
- `delivery`
- `invoice`
- `payment`
- `product`
- customer → order (`placed`)
- order → delivery (`fulfilled by`)
- delivery/order → invoice (`billed via`)
- invoice → payment (`paid by`)
- order → product (`contains`)
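A graph builder for this mapping has to deduplicate entities that recur across joined rows. An illustrative sketch covering the customer → order → delivery slice (column names are assumptions; the real builder handles all six entity types):

```python
# Sketch of turning flat joined rows into React Flow-style nodes/edges.
# Dicts keyed by id/endpoint-pair deduplicate entities and relationships
# that repeat across joined rows.

def rows_to_graph(rows: list) -> dict:
    nodes, edges = {}, {}
    for r in rows:
        cust = f"customer:{r['customer_id']}"
        order = f"order:{r['order_id']}"
        nodes[cust] = {"id": cust, "type": "customer"}
        nodes[order] = {"id": order, "type": "order"}
        edges[(cust, order)] = {"source": cust, "target": order,
                                "label": "placed"}
        if r.get("delivery_id") is not None:
            dlv = f"delivery:{r['delivery_id']}"
            nodes[dlv] = {"id": dlv, "type": "delivery"}
            edges[(order, dlv)] = {"source": order, "target": dlv,
                                   "label": "fulfilled by"}
    return {"nodes": list(nodes.values()), "edges": list(edges.values())}
```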
Graph representation makes cross-entity dependency and flow-state analysis much easier than reading flat joined rows, especially for traceability questions.
Dodge AI is intentionally scoped to the business dataset domain.
- Non-dataset questions are rejected.
- Standard guardrail response example:
"This system only answers dataset-related queries."
This reduces irrelevant model behavior and keeps results aligned with assignment scope.
```bash
cd backend
python -m venv .venv
source .venv/bin/activate   # Windows: .venv\Scripts\activate
pip install -r requirements.txt
uvicorn main:app --reload --port 8000
```

```bash
cd frontend
npm install
npm run dev
```

Create `backend/.env`:

```
DATABASE_URL=postgresql://<user>:<password>@<host>:<port>/<db>
GROQ_API_KEY=<your_groq_key>
GEMINI_API_KEY=<your_gemini_key>
```

Optional frontend env (`frontend/.env`):

```
VITE_API_BASE_URL=http://localhost:8000
```

Recommended split deployment:
- Backend: Render
- Database: Supabase (PostgreSQL)
- Frontend: Vercel
Deployment checklist:
- Set all environment variables in hosting dashboard
- Ensure CORS is configured for frontend domain
- Verify health endpoint and `/query` responses post-deploy
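The post-deploy checks can be scripted as a quick smoke test. The host is a placeholder, the `/health` path and the `question` request field are assumptions about the API; adjust both to match the deployed backend:

```shell
# Post-deploy smoke test (placeholder host; /health path and request
# body field are assumptions about this API).
BASE="https://your-backend.onrender.com"

curl -fsS "$BASE/health" && echo "health OK"

curl -fsS -X POST "$BASE/query" \
  -H "Content-Type: application/json" \
  -d '{"question": "Show all orders"}' | head -c 200
```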
- SQLite → PostgreSQL migration: moved to production-grade connection pooling and schema loading.
- LLM incorrect columns/aliases: added schema-grounded prompts + validation/sanitization guardrails.
- Graph not rendering reliably: enforced stable node/edge contracts and frontend rerender reset logic.
- Intermittent network failures: improved timeout handling and robust API error-state flows.
- Better graph UX (entity filtering, edge legends, clustering)
- Query result caching with TTL + invalidation strategy
- Explainable SQL mode (show why query was generated)
- Expanded analytics templates and KPI dashboarding
- Role-aware access controls and audit logging
AI tools were used as development accelerators (e.g., Copilot/Cursor-style assistance) for:
- prompt engineering iterations
- debugging assistance
- refactoring support
- documentation polishing
All generated suggestions were manually reviewed and adapted to match runtime behavior and project requirements.
Dodge AI demonstrates a practical architecture for combining LLM-driven SQL generation with graph-based data interpretation in a production-oriented full-stack application. It improves data accessibility for business users while preserving backend guardrails, structured error handling, and stable visualization flow.

