Document chat & embeddings

Athenium turns every uploaded resource into something students can interrogate — ask a question, get a cited answer, drill in further. This guide explains what gets indexed, how to ask good questions, and how the underlying pipeline works.

What gets indexed

Anything a professor uploads to a classroom's Resources tab is eligible for chat — PDFs, DOCX, PPT, and Markdown notes. Athenium extracts the text server-side via the /api/extract-content endpoint, which uses pdf-parse for PDFs, mammoth for DOCX, and office-text-extractor for the rest.
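The routing described above can be pictured as a small lookup on the file extension. This is only a sketch: `pickExtractor` and the `Extractor` type are hypothetical names, not taken from the Athenium codebase, and the real endpoint may dispatch on MIME type or something else entirely.

```typescript
// Hypothetical helper: illustrates the "PDF / DOCX / everything else" split
// described in this guide. It only names the library; it does not call it.
type Extractor = "pdf-parse" | "mammoth" | "office-text-extractor";

function pickExtractor(filename: string): Extractor {
  const dot = filename.lastIndexOf(".");
  // No dot means no extension; fall through to the default extractor.
  const ext = dot === -1 ? "" : filename.slice(dot + 1).toLowerCase();
  switch (ext) {
    case "pdf":
      return "pdf-parse";
    case "docx":
      return "mammoth";
    default:
      // "The rest" per the guide: PPT, Markdown notes, and so on.
      return "office-text-extractor";
  }
}
```

For example, `pickExtractor("week3-slides.ppt")` falls through to the default branch, matching the "office-text-extractor for the rest" rule.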

Extracted text is passed into the chat as context — there are no persistent embeddings yet, just per-question retrieval over the active document.

Document size cap
Per-message context is currently capped at 15,000 characters of extracted text. Long documents are still useful, but you'll get the best results by chatting with one chapter or unit at a time.
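To make the cap concrete, here is a minimal sketch of how extracted text could be trimmed before it is sent as context. The helper name `capContext` and the choice to back up to a paragraph break are assumptions for illustration; the guide only states the 15,000-character limit, not how Athenium applies it.

```typescript
// The 15,000 figure comes from this guide; everything else is illustrative.
const MAX_CONTEXT_CHARS = 15_000;

// Hypothetical helper: returns at most `limit` characters (UTF-16 code units),
// preferring to cut at a blank line so the context does not end mid-paragraph.
function capContext(text: string, limit: number = MAX_CONTEXT_CHARS): string {
  if (text.length <= limit) return text;
  const head = text.slice(0, limit);
  const lastBreak = head.lastIndexOf("\n\n");
  // Only back up to the paragraph break if it keeps more than 80% of the budget.
  return lastBreak > limit * 0.8 ? head.slice(0, lastBreak) : head;
}
```

Whatever the actual truncation rule is, text beyond the cap cannot inform the answer, which is why chatting with one chapter or unit at a time tends to work better than uploading a whole textbook.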

Starting a chat

Open any resource and click Ask. The chat panel slides in and you can type a question. Responses stream in token-by-token — you can start reading before the model is done writing.

What makes a good question

  • Be specific. “Explain Chapter 3” will get you a paraphrase of the chapter. “Compare TCP vs UDP in three lines” gets you a useful study aid.
  • Ask for structure. “List the four steps” or “summarize as a bullet list” works well — the system prompt encourages markdown formatting.
  • Cite back. Every answer ends with a source line referencing a page or section. Click through to verify before relying on it for an exam.

Source attribution

Every response is required to end with a source line:

> 📚 Source: [Page 12] Under Transport Layer: "TCP is connection-oriented..."

The system prompt requires this line on every response. If the model cannot find a supporting passage, it is instructed to say so rather than fabricate a citation.
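Because the source line has a predictable shape, a client could check for it before rendering a "verify" link. The sketch below assumes the format shown in the example above (`📚 Source: [reference] ...`, optionally prefixed with `>`); `findSourceRef` is a hypothetical helper, and the guide does not say whether Athenium performs any such check.

```typescript
// Assumed format, inferred from the example line in this guide.
const SOURCE_LINE = /^(?:>\s*)?📚 Source: \[([^\]]+)\]/u;

// Hypothetical helper: returns the bracketed reference from the final line
// of an answer (e.g. "Page 12"), or null when no source line is present.
function findSourceRef(answer: string): string | null {
  const lines = answer.trimEnd().split("\n");
  const last = (lines[lines.length - 1] ?? "").trim();
  const match = SOURCE_LINE.exec(last);
  return match?.[1] ?? null;
}
```

A `null` result is a useful signal in itself: the answer either could not be grounded in the document or the model ignored the formatting instruction, and in both cases the student should treat it with more caution.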

Streaming and the API

Chat is implemented in src/lib/utils/gemini.ts using the @langchain/google-genai client. The exported generateWithGeminiStream function is an async generator that yields chunks of text as they arrive, which the React UI consumes via async iteration:

for await (const chunk of generateWithGeminiStream(question, context)) {
  setAnswer((prev) => prev + chunk);
}

Word-boundary buffering happens inside the generator so the rendered text reads smoothly rather than character-by-character.
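Word-boundary buffering can be sketched as follows: hold back any trailing partial word and only emit text up to the last whitespace seen so far. Both `splitAtWordBoundary` and `bufferByWord` are hypothetical names; the internals of generateWithGeminiStream are not shown in this guide, so treat this as one plausible approach rather than the actual implementation.

```typescript
// Splits a buffer into the part that ends on whitespace (safe to render)
// and the trailing partial word (held back until more text arrives).
function splitAtWordBoundary(buffer: string): { ready: string; rest: string } {
  // Index of the last whitespace character, or -1 if there is none.
  const cut = buffer.search(/\s\S*$/);
  return { ready: buffer.slice(0, cut + 1), rest: buffer.slice(cut + 1) };
}

// Wraps any async stream of text chunks so that consumers only ever
// receive whole words, plus a final flush when the stream ends.
async function* bufferByWord(
  chunks: AsyncIterable<string>
): AsyncGenerator<string> {
  let pending = "";
  for await (const chunk of chunks) {
    const { ready, rest } = splitAtWordBoundary(pending + chunk);
    pending = rest;
    if (ready) yield ready;
  }
  // Emit whatever is left, even if it does not end on whitespace.
  if (pending) yield pending;
}
```

With this shape, a stream that arrives as "TCP is conn" followed by "ection-oriented." renders "TCP is " first and then "connection-oriented.", so the reader never sees a half-finished word.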

Privacy and the public key

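Text extraction happens on the server, but the chat request itself is sent from the browser using the NEXT_PUBLIC_GOOGLE_AI_API_KEY environment variable. This is intentional: Next.js inlines NEXT_PUBLIC_-prefixed variables into the client bundle, so this key is visible to anyone who loads the page. Server-only endpoints (like AI assignment generation) use a separate key that never reaches the browser.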

Don't paste secrets
Treat the document chat as you would a public LLM. Any text sent in a question is forwarded to Google's API. Don't paste student PII or sealed exam content.