Case Study

Graphwise

Code knowledge graph platform with GraphRAG: processes repositories, models code relationships in Neo4j, and enables natural-language code search and Q&A.

Solo Developer

Feb 2025 – Mar 2025

React

Vite

TypeScript

Tailwind CSS

FastAPI

Neo4j

PostgreSQL

Redis

Docker

Caddy

GitHub Actions

Azure

Problem

Understanding a large, unfamiliar codebase is one of the most time-consuming parts of software development. Traditional keyword search misses semantic relationships: you can find a function by name, but not by what it connects to. Existing tools either require expensive embeddings for every file or produce flat, context-free results.

Solution

Built a GraphRAG pipeline that parses a repository's AST, extracts entities (functions, classes, imports, modules) and their relationships, and stores them as a property graph in Neo4j. At query time, natural-language questions are converted to Cypher traversals that retrieve structurally relevant context, which is then passed to an LLM. The answers are grounded in actual code topology, not just text similarity.

Impact

Natural-language code search grounded in actual code relationships, not just embeddings
CI/CD pipeline with GitHub Actions: a push to main automatically rebuilds and redeploys

What I Learned

Graph-based and vector-based retrieval solve different problems: graphs are better for 'what calls what', vectors for 'what does this concept mean'. Combining both gives the best results.
Generating Neo4j Cypher from natural language was the hardest part. LLMs hallucinate invalid property names; adding a schema-awareness step before generation reduced errors significantly.
Caddy was a much simpler reverse proxy than Nginx at this scale: automatic HTTPS with one line of config.
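The schema-awareness idea from the Cypher lesson above can be sketched roughly as follows. This is an illustrative assumption, not the project's actual implementation: the `SCHEMA` contents, the `RELATIONSHIPS` list, and the helper names `schema_prompt` and `unknown_properties` are all hypothetical. The first helper renders the schema as text to prepend to the LLM prompt; the second is an optional post-check that flags property names the schema does not define. The regex check is deliberately simple and would also flag dotted function names such as APOC procedures.

```python
import re

# Hypothetical schema: node labels mapped to their allowed property names.
SCHEMA = {
    "Function": {"name", "file_path", "start_line"},
    "Class": {"name", "file_path"},
    "Module": {"name", "path"},
}
RELATIONSHIPS = ["CALLS", "IMPORTS", "DEFINES"]


def schema_prompt(schema, relationships) -> str:
    """Render the schema as plain text to prepend to the LLM prompt."""
    lines = ["Graph schema (use only these labels, properties, relationships):"]
    for label in sorted(schema):
        props = ", ".join(sorted(schema[label]))
        lines.append(f"(:{label} {{{props}}})")
    lines.append("Relationships: " + ", ".join(relationships))
    return "\n".join(lines)


def unknown_properties(cypher: str, schema) -> set:
    """Return names used as `variable.property` that no label in the schema defines."""
    allowed = set().union(*schema.values())
    # Drop quoted string literals first so values like 'utils.py' are not matched.
    stripped = re.sub(r"'[^']*'|\"[^\"]*\"", "", cypher)
    used = set(re.findall(r"\b[A-Za-z_]\w*\.([A-Za-z_]\w*)", stripped))
    return used - allowed
```

A generated query that returns a non-empty set from `unknown_properties` can be rejected or sent back to the model with the offending names, rather than executed against Neo4j.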