RAG mode
RAG searches the indexed thesis chunks, optionally reranks the matches, and answers
with citations from the retrieved sources. This is usually the best mode for broad
questions across many theses. It is limited by retrieval quality: if the wrong chunks
are selected, the answer can miss relevant details.
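The flow described above can be sketched roughly as follows. This is a minimal illustration, not the application's actual code: the Chunk type, the retrieve and build_prompt helpers, and the two-dimensional toy vectors are all hypothetical stand-ins for the real embedding model and Qdrant search.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass
class Chunk:
    thesis_id: str        # hypothetical identifier of the source thesis
    text: str             # chunk text that can be cited
    vector: list          # toy embedding; a real system uses a model


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_vector, chunks, top_k=2):
    """Return the top_k chunks most similar to the query vector."""
    ranked = sorted(chunks, key=lambda c: cosine(query_vector, c.vector), reverse=True)
    return ranked[:top_k]


def build_prompt(question, hits):
    """Number the retrieved chunks so the answer can cite them as [n]."""
    sources = "\n".join(
        f"[{i}] ({c.thesis_id}) {c.text}" for i, c in enumerate(hits, start=1)
    )
    return (
        "Answer using only the sources below and cite them as [n].\n"
        f"{sources}\nQuestion: {question}"
    )


chunks = [
    Chunk("thesis-a", "Solar cells degrade under humidity.", [1.0, 0.0]),
    Chunk("thesis-b", "Transformers scale with data.", [0.0, 1.0]),
    Chunk("thesis-c", "Perovskite solar cells are efficient.", [0.9, 0.1]),
]
hits = retrieve([1.0, 0.0], chunks, top_k=2)
prompt = build_prompt("How do solar cells degrade?", hits)
print(prompt)
```

Only the chunks that survive retrieval reach the prompt, which is why a poor match at this step cannot be repaired later by the answer model.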
LLM mode
LLM mode works with one selected thesis and sends its extracted document text as
context for focused follow-up questions. This is useful when you already know the
relevant thesis. It can be slower or less reliable for very long documents because the
prompt becomes large.
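To get a feel for when a document is too long for this mode, a rough size check like the one below can help. It is only a sketch under stated assumptions: the four-characters-per-token ratio is a coarse heuristic for English text, and estimate_tokens, fits_context, and the 8000-token limit are hypothetical, not values taken from the application.

```python
CHARS_PER_TOKEN = 4  # assumption: coarse heuristic, not a real tokenizer


def estimate_tokens(text):
    """Very rough token estimate based on character count."""
    return len(text) // CHARS_PER_TOKEN


def fits_context(document_text, question, max_tokens=8000):
    """Return True if document plus question likely fit the (hypothetical) limit."""
    return estimate_tokens(document_text) + estimate_tokens(question) <= max_tokens


short_doc = "word " * 1000      # about 5,000 characters
long_doc = "word " * 100000     # about 500,000 characters
print(fits_context(short_doc, "What is the main finding?"))
print(fits_context(long_doc, "What is the main finding?"))
```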
Provider
Morpheus is the default because it is free and privacy-preserving, but it can be
slower. OpenAI is usually faster and may give better answers, but it costs money and
sends your prompts to OpenAI.
Reranker
The BGE reranker re-scores candidate chunks after Qdrant retrieval, which can improve
source relevance and citations. It is disabled by default because it is computationally
expensive on CPU and can dominate response time. Turn it on when precision matters more
than speed.
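The re-scoring step can be illustrated with the sketch below. It does not call the BGE model: overlap_score is a deliberately simple word-overlap stand-in for the reranker's relevance score, and rerank and its parameters are hypothetical names chosen for this example. A real reranker runs a neural model on every query and chunk pair, which is where the CPU cost comes from.

```python
def overlap_score(query, text):
    """Stand-in relevance score: fraction of query words found in the text."""
    query_words = set(query.lower().split())
    text_words = set(text.lower().split())
    return len(query_words & text_words) / len(query_words) if query_words else 0.0


def rerank(query, candidates, score_fn=overlap_score, top_n=2):
    """Score every (query, candidate) pair and keep the top_n candidates."""
    scored = [(score_fn(query, text), text) for text in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:top_n]]


# Candidates as they might come back from vector search, in retrieval order.
candidates = [
    "Qdrant stores vectors for search",
    "reranker models score query and chunk pairs",
    "the thesis discusses climate policy",
]
best = rerank("how do reranker models score chunk pairs", candidates, top_n=2)
print(best)
```

Because each candidate is scored individually against the query, the cost grows with the number of candidates, so reranking fewer chunks is one way to limit the slowdown.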
Live progress
The chat UI receives live backend events while a question is processed. These messages
show whether the backend is embedding the query, searching Qdrant, reranking chunks,
or waiting for the answer model.
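A sequence of such events might look like the sketch below. The stage names, the event dictionary shape, and the progress_events generator are assumptions made for illustration; the actual event format and transport used by the backend are not specified here.

```python
from typing import Iterator

# Hypothetical stage names mirroring the steps listed above.
STAGES = [
    "embedding_query",
    "searching_qdrant",
    "reranking_chunks",
    "waiting_for_model",
]


def progress_events(use_reranker: bool) -> Iterator[dict]:
    """Yield one progress event per stage, skipping reranking when it is off."""
    for stage in STAGES:
        if stage == "reranking_chunks" and not use_reranker:
            continue
        yield {"type": "progress", "stage": stage}
    yield {"type": "done"}


events = list(progress_events(use_reranker=False))
for event in events:
    print(event)
```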
Scope
Edit Scope lets you add or remove thesis documents from the current conversation's
focus. After changing the scope, rerun the last question with the circular arrow.
Input
Press Enter to send a question. Press Shift+Enter for a line break. New Chat clears
the current browser conversation state.