Zhiyuan Song
← Home

Study Assistant

Turn course packs and study materials into a searchable, generative, testable learning flow: multi-format ingestion and chunking, pgvector semantic search with cross-document links, and OpenAI-backed translation, summaries, glossaries, and flashcard expansion. A practice exam mode (multi-type questions, scoring, progress tracking) is planned, alongside academic integrity features: source traceability, auditing, and permissions.

Implementation details, environment variables, and APIs follow the GitHub README and docs/. This page is a curated site summary covering architecture, data flow, the data model, and the phased roadmap, aligned with docs/ARCHITECTURE.md and docs/ROADMAP.md; the repository is authoritative where they conflict.

Core features

Smart document processing

  • PDF, PPT, TXT, and more
  • Automatic chunking and extraction
  • OCR for scanned pages
  • Tables and equations (improving with each iteration)

AI content generation

  • Contextual translation from selected spans
  • Auto study summaries
  • Glossary and flashcards
  • Context-aware knowledge expansion

Vector semantic search

  • PostgreSQL pgvector semantic retrieval
  • Cross-document linking and personalized recommendations

Practice exams (planned / in development)

  • Question generation from study materials (MCQ, fill-in, short answer, etc.)
  • Auto grading, feedback, and progress analytics

Academic integrity & compliance

  • Source traceability
  • Integrity checks and audit logs
  • Permissions and access control

Quick start

Requirements: Node.js 18+; PostgreSQL 15+ with pgvector; Redis 6+; object storage (e.g. MinIO, S3-compatible); optional PM2 for process management.

1. Install

git clone https://github.com/songzhiyuan98/study-assistant.git
cd study-assistant
pnpm install   # or npm install; follow the package manager pinned in the README

2. Configure environment

cp packages/db/.env.example packages/db/.env
# Edit packages/db/.env

Common variables: DATABASE_URL (PostgreSQL), OPENAI_API_KEY, NEXTAUTH_SECRET, etc.; full list in the repo examples.

3. Database

npm run docker:up      # PostgreSQL, Redis, MinIO, etc. (see package.json scripts)
npm run db:migrate
npm run db:seed

4. Dev servers

npm run dev

Open http://localhost:3000 (port per local config).

User workflow (product)

  1. Register (email or Google OAuth)
  2. Create folders to organize documents by course or topic
  3. Upload PDF / PPT / TXT
  4. Select passages to study inside a document
  5. Generate translations, summaries, flashcards, and other AI artifacts
  6. (Roadmap) Build practice exams from materials and track progress

Feature progress

Where this snapshot conflicts with ROADMAP.md, the repository is authoritative.

Snapshot (2025-09-08)

Done:

  • Database foundation: PostgreSQL + pgvector + Prisma
  • Auth: NextAuth.js (credentials / OAuth) with middleware protection
  • Folder organization: per-user document taxonomy
  • Uploads: MinIO storage + API integration (including bugfix loops)

In progress: PDF / TXT parsing pipeline (content extraction and chunking).

Completion (planning view): phase 1 ~67% (4/6 major tasks); overall ~22% (4/18 major tasks).

Capability matrix (short)

Feature             | Status         | Notes
Folder management   | Done           | Multi-level document organization
User authentication | Done           | Email / Google OAuth
PDF processing      | In development | Text extraction and chunking
AI translation      | Planned        | Multilingual
Exam system         | Planned        | Auto generation and grading

Tech architecture

Frontend

  • Next.js 14 + React 18
  • TypeScript
  • Tailwind CSS
  • Zustand + React Query
  • NextAuth.js

Backend & data

  • API: Fastify
  • Database: PostgreSQL + pgvector (IVFFLAT or other index strategies)
  • ORM: Prisma
  • Queues: BullMQ + Redis
  • Object storage: MinIO / S3-compatible

AI

  • Models: OpenAI GPT-3.5 / 4 family
  • Embeddings: text-embedding-ada-002 (or later per config)
  • Retrieval: pgvector

System architecture

Overview

Study Assistant uses a modern modular monorepo: web and API are decoupled, heavy jobs run on queues and workers, files land in object storage, structured and vector data live in PostgreSQL, and long-running inference and embedding work calls external AI services. The shape resembles a set of collaborating microservices; deployments can scale horizontally by process or container.

User → Web app → API (Fastify) → PostgreSQL (business + vectors)
                    │
                    ├→ Redis queue (BullMQ) → background workers (ingest / generate / export)
                    │
                    ├→ MinIO / S3 (raw files and exports)
                    │
                    └→ OpenAI, etc. (embeddings + generation)

Data flow

1. Upload & processing

  1. User uploads → web client
  2. Client calls API → object storage (MinIO / S3)
  3. API writes business rows → enqueue ingestion job
  4. Worker: parse / chunk → embeddings → persist to DB
  5. Update lecture / segment processing state for polling or push UI
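The parse-and-chunk step (3–4) can be sketched as a plain function. `chunkText`, the segment shape, and the size/overlap defaults are illustrative assumptions; the repository's actual pipeline lives in its workers.

```typescript
// Illustrative chunker: split extracted text into overlapping segments.
// Sizes and field names are hypothetical, not the repo's actual API.
interface SegmentDraft {
  index: number; // ordinal within the lecture
  body: string;  // chunk text to be embedded
  start: number; // character anchor into the source text
}

function chunkText(text: string, size = 800, overlap = 100): SegmentDraft[] {
  const segments: SegmentDraft[] = [];
  const step = size - overlap;
  for (let start = 0, index = 0; start < text.length; start += step, index++) {
    segments.push({ index, body: text.slice(start, start + size), start });
    if (start + size >= text.length) break; // last chunk reached the end
  }
  return segments;
}
```

Keeping a character `start` anchor per segment is what later lets generated artifacts point back at exact source spans.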

2. Generation & learning

  1. User selects text in reader → create Selection record
  2. Generation request → enqueue AI job (context + source refs)
  3. Worker calls OpenAI → translation / summary / flashcards, etc.
  4. Persist results (payloadJson, sourceRefs) → notify or poll UI
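The job enqueued in step 2 might carry a payload like the following; the `payloadJson` / `sourceRefs` names follow the data-model summary on this page, and everything else is a hedged sketch.

```typescript
// Hypothetical shape of a generation job (step 2): context for the prompt
// plus source segment ids kept for traceability. Not the repo's actual type.
type ItemType = "translation" | "summary" | "flashcards";

interface GenerateJob {
  selectionId: string;
  type: ItemType;
  context: string;      // concatenated segment bodies for the prompt
  sourceRefs: string[]; // segment ids backing the generated artifact
}

function buildGenerateJob(
  selectionId: string,
  type: ItemType,
  segments: { id: string; body: string }[]
): GenerateJob {
  return {
    selectionId,
    type,
    context: segments.map((s) => s.body).join("\n\n"),
    sourceRefs: segments.map((s) => s.id),
  };
}
```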

Data model

Core relationships (conceptual)

USER
├── FOLDER
│   └── LECTURE (document)
│       └── SEGMENT (chunk + embedding vector)
├── SELECTION (user highlight of segments)
│   └── ITEM (generated artifact: translation, summary, etc.)
└── EXAM (roadmap)
    └── EXAM_ATTEMPT (responses)

Key fields (summary)

Entity    | Highlights
User      | id, email, name, optional password, role (RBAC)
Folder    | id, name, description, userId
Lecture   | id, folderId, title, fileUrl, type, processing status
Segment   | id, lectureId, body, embedding vector, page / anchor
Selection | id, userId, lectureId, segmentIds[]
Item      | id, selectionId, type, payloadJson, sourceRefs
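The key fields can be mirrored as TypeScript interfaces. This is a conceptual sketch of the summary above, not the actual Prisma schema in packages/db.

```typescript
// Conceptual mirror of the data-model summary; the real schema is defined
// with Prisma in packages/db and may differ in names and detail.
interface Lecture {
  id: string;
  folderId: string;
  title: string;
  fileUrl: string;
  type: "pdf" | "ppt" | "txt";
  status: "uploaded" | "processing" | "ready" | "failed"; // assumed states
}

interface Segment {
  id: string;
  lectureId: string;
  body: string;
  embedding: number[]; // pgvector column, e.g. 1536 dims for ada-002
  page?: number;       // optional page / anchor info
}

interface Item {
  id: string;
  selectionId: string;
  type: "translation" | "summary" | "flashcards";
  payloadJson: unknown;
  sourceRefs: string[]; // segment ids backing the artifact
}
```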

Component boundaries

Frontend (Next.js App Router)

app/web/src/
├── app/
│   ├── (auth)/           # sign-in / sign-up
│   ├── dashboard/
│   ├── upload/
│   ├── library/
│   └── api/              # Next.js route handlers (if any)
├── components/
│   ├── ui/
│   ├── forms/
│   └── layout/
├── lib/
└── providers/

Backend & workers

app/
├── api/                  # Fastify: routes, plugins, middleware
├── workers/              # BullMQ: ingest, generate, export, etc.
└── sidecar/              # optional: Python OCR or other standalone services

Vector retrieval pipeline

Semantic search flow

  1. Query: user text → embedding (e.g. OpenAI ada-002 family)
  2. Retrieve: pgvector + IVFFlat (or other index) for approximate nearest neighbors
  3. Rank: cosine similarity → top-K segments
  4. Post-process: dedupe, filter, permission checks → return spans

Parameters & policy (target)

  • Index: IVFFlat; distance: cosine
  • Dimension: 1536 (ada-002; change with model)
  • Similarity threshold: e.g. ≥ 0.8 (tunable)
  • Default top: ~10 related segments
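The target policy above (cosine distance, threshold ≥ 0.8, top ~10) can be sketched in plain code. In production the ranking runs inside PostgreSQL via pgvector; this in-memory version only illustrates the math.

```typescript
// Minimal ranking sketch matching the target policy: cosine similarity,
// threshold 0.8, top 10. Real retrieval happens in PostgreSQL via pgvector.
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

function topK(
  query: number[],
  segments: { id: string; embedding: number[] }[],
  threshold = 0.8,
  k = 10
): { id: string; score: number }[] {
  return segments
    .map((s) => ({ id: s.id, score: cosine(query, s.embedding) }))
    .filter((r) => r.score >= threshold)
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}
```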

RBAC & audit

RBAC roles (target)

  • ADMIN: system and user administration, audit logs
  • INSTRUCTOR: course content management, student progress
  • STUDENT: access to personal documents and generated artifacts

Audit & compliance

  • Log key CRUD and AI calls (searchable, exportable)
  • Retention: e.g. 6 months active + 2 years archive (per final implementation)
  • Academic integrity reports, personal data access logs (privacy)

Async queues (BullMQ)

Task types (example names)

  • document:ingest — document ingestion & parsing
  • content:generate — AI artifact generation
  • export:data — exports
  • cleanup:files — cleanup
  • vector:index — vector index maintenance

Priority & concurrency

  • High: user-triggered generation
  • Medium: file ingestion and chunking
  • Low: cleanup and maintenance
  • Independent concurrency caps per job type to avoid starvation
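One way to express the priority and concurrency policy is a per-job-type table. The numbers here are hypothetical; in BullMQ they would map to job priority (lower number = higher priority) and per-queue worker concurrency, but the exact wiring is the repository's call.

```typescript
// Hypothetical policy table matching the tiers above; values are
// illustrative, not the repo's actual configuration.
const queuePolicy: Record<string, { priority: number; concurrency: number }> = {
  "content:generate": { priority: 1, concurrency: 4 }, // user-triggered, high
  "document:ingest":  { priority: 2, concurrency: 2 }, // ingestion, medium
  "vector:index":     { priority: 2, concurrency: 1 },
  "export:data":      { priority: 3, concurrency: 1 }, // maintenance, low
  "cleanup:files":    { priority: 3, concurrency: 1 },
};
```

Independent concurrency caps per type mean a burst of ingestion jobs cannot starve user-triggered generation.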

Observability, SLOs, and API

Target KPIs (design)

  • API latency: P95 < 200 ms for light endpoints; first paint < 3 s
  • Document ingestion: target < 30 s per document (size-dependent)
  • AI: GPT-class interactions under ~5 s; embeddings under ~2 s
  • Concurrency: 100+ simultaneous users after horizontal scale
  • Availability target: 99.9% SLA (production phase)

Monitoring stack (planned)

  • Tracing / metrics: OpenTelemetry
  • Time series: Prometheus; dashboards: Grafana
  • Logs: ELK or managed equivalent
  • Alerts: PagerDuty or existing on-call tooling

API principles

  • RESTful; versioned paths like /api/v1/
  • Uniform JSON responses and error codes; auth: JWT Bearer
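A uniform JSON envelope could look like the sketch below; the `ok`/`error` shapes and code names are illustrative assumptions, not the API's actual contract.

```typescript
// Sketch of a uniform JSON response envelope; field and code names are
// hypothetical examples of the principle, not the real API contract.
type ApiResponse<T> =
  | { ok: true; data: T }
  | { ok: false; error: { code: string; message: string } };

function ok<T>(data: T): ApiResponse<T> {
  return { ok: true, data };
}

function fail(code: string, message: string): ApiResponse<never> {
  return { ok: false, error: { code, message } };
}
```

A discriminated union like this lets clients narrow on `ok` once and handle success and error payloads type-safely.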

Key endpoints (summary)

POST /api/auth/register
POST /api/auth/login
GET  /api/auth/session

POST /api/lectures
GET  /api/lectures/:id
GET  /api/folders

POST /api/selections
POST /api/items/generate
GET  /api/items/:id

Repository layout (summary)

study-assistant/
├── app/
│   ├── web/          # Next.js frontend
│   ├── api/          # Fastify API
│   └── workers/      # background workers
├── packages/
│   ├── db/           # Prisma + migrations
│   ├── shared/
│   └── ui/
├── docs/             # ROADMAP, ARCHITECTURE, OPERATIONS, etc.
└── CLAUDE.md         # AI collaboration notes (if kept)

Roadmap

Weeks indicate planning cadence; checkboxes follow the latest commits and ROADMAP.md.

Phase 1: foundation (~weeks 1–2)

Goal: core document pipeline + basic generation loop.

Week 1 (infrastructure)

  • Monorepo setup (done)
  • PostgreSQL + pgvector (done, 2025-09-08)
  • NextAuth baseline (done, 2025-09-08)
  • File upload + MinIO / S3 (done, 2025-09-08)
  • Current focus: PDF / TXT parsing (Node.js)
  • Chunking and anchor model

Week 2 (selection & generation)

  • PDF.js reader, text selection, segment management
  • BullMQ queue wiring
  • Baseline AI translation and summaries
  • Embeddings + similarity retrieval, review UI for outputs
  • Folder organization (done, 2025-09-08)

Phase 2: feature depth (~weeks 3–4)

Goal: advanced document capabilities + exam MVP.

Week 3: PPTX, OCR (Python sidecar), two-column paper layout, tables / equations, vector sidebar, exports (Markdown, CSV, Anki, etc.).

Week 4: exam blueprints, evidence-based item generation, player with timers, auto grading + rubrics, performance analysis and wrong-answer coaching.

Phase 3: production readiness (~weeks 5–6)

Goal: security, performance, deployment.

Week 5: academic integrity policy, end-to-end auditing, RBAC, cost/token budgets, encryption/privacy, compliance reporting.

Week 6: OpenTelemetry, caching/perf, Docker/K8s manifests, CI/CD, error tracking/alerts, production deployment templates.

Next actions (excerpt)

  • Implement PDF text extraction (PDF.js or pdf-parse, etc.)
  • PPTX parsing approach (dedicated library / server conversion)
  • TXT encoding detection and reads
  • Chunking strategy and anchor model design
  • Preprocessing orchestration aligned with queue job taxonomy
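For the TXT encoding item above, a naive first step is stripping a UTF-8 BOM before reading; full detection (chardet-style heuristics) remains an open design point, and this helper is only a sketch.

```typescript
// Naive sketch for TXT reads: strip a UTF-8 BOM if present, then decode.
// Real encoding detection is an open design point, not implemented here.
function stripUtf8Bom(buf: Buffer): string {
  if (buf.length >= 3 && buf[0] === 0xef && buf[1] === 0xbb && buf[2] === 0xbf) {
    return buf.subarray(3).toString("utf8");
  }
  return buf.toString("utf8");
}
```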

Update this curated page after each major delivery to reflect progress and schedule shifts.

Common commands

npm run dev          # local dev
npm run build        # production build
npm run test         # tests

npm run db:migrate   # migrations
npm run db:seed      # seed data
npm run db:studio    # Prisma Studio

npm run lint
npm run type-check
npm run format

Script names follow root package.json.

Contributing & license

Fork → feature branch → commits → push → pull request; conventions in CONTRIBUTING.md.

Docs: docs/ROADMAP.md, docs/ARCHITECTURE.md, docs/OPERATIONS.md, API notes, CHANGELOG, CLAUDE guide, etc.

License: MIT.

Thanks to Next.js, Prisma, pgvector, OpenAI, and other OSS/providers.