Knowledge Graphs

Build a unified academic research system that integrates researchers, papers, institutions, and topics in a single database. Four query styles work side by side: graph traversal for multi-hop queries across co-authorship and citation networks, vector similarity for semantic paper search via vectorNeighbors() on paper embeddings, full-text search for keyword queries on abstracts using SEARCH_INDEX(), and time-series tracking of citation activity.

Architecture Overview

Vertices

Researcher, Paper, Topic, Institution

Edges

CO_AUTHORED, CITES, COVERS, AFFILIATED_WITH

Documents

PaperActivity (paperId, citationCount, timestamp)

Papers carry 4-dimensional embedding vectors for semantic search and full-text indexed abstracts for keyword search. Citation activity is tracked as time-series documents.
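
To make the time-series side concrete, here is a minimal Python sketch of what can be derived from PaperActivity documents. It is not ArcadeDB code. It assumes that citationCount is a cumulative snapshot and that timestamp is an ISO 8601 string; neither is stated above. The sample records and the citation_gains helper are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PaperActivity:
    paperId: str
    citationCount: int  # assumed to be a cumulative snapshot
    timestamp: str      # assumed ISO 8601, so string order equals time order


# Hypothetical sample documents, deliberately out of order.
ACTIVITY = [
    PaperActivity("p1", 17, "2024-03-01T00:00:00Z"),
    PaperActivity("p1", 10, "2024-01-01T00:00:00Z"),
    PaperActivity("p2", 4, "2024-02-01T00:00:00Z"),
    PaperActivity("p1", 12, "2024-02-01T00:00:00Z"),
]


def citation_gains(records, paper_id):
    """Return (timestamp, citations gained since the previous snapshot) pairs."""
    series = sorted(
        (r for r in records if r.paperId == paper_id),
        key=lambda r: r.timestamp,
    )
    return [
        (curr.timestamp, curr.citationCount - prev.citationCount)
        for prev, curr in zip(series, series[1:])
    ]


gains = citation_gains(ACTIVITY, "p1")
print(gains)
```

Sorting on the timestamp string only works when every record uses the same ISO 8601 layout and time zone; otherwise parse the values into datetime objects first.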

Key Queries

Co-authorship Network — Discover collaborations between researchers:

MATCH (r:Researcher)-[:CO_AUTHORED]->(p:Paper)<-[:CO_AUTHORED]-(coauthor:Researcher)
WHERE r.name = 'Dr. Smith'
RETURN coauthor.name, collect(p.title) AS papers
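
The pattern above walks two CO_AUTHORED edges that meet at the same Paper. As a rough illustration of that two-hop traversal, the Python sketch below reproduces the grouping over an in-memory edge list. The sample edges and the coauthors helper are hypothetical and only mirror the shape of the query result; they are not how the database evaluates it.

```python
from collections import defaultdict

# Hypothetical (researcher, paper title) pairs standing in for CO_AUTHORED edges.
CO_AUTHORED = [
    ("Dr. Smith", "Graph Embeddings"),
    ("Dr. Lee", "Graph Embeddings"),
    ("Dr. Smith", "Citation Dynamics"),
    ("Dr. Lee", "Citation Dynamics"),
    ("Dr. Patel", "Citation Dynamics"),
    ("Dr. Patel", "Unrelated Work"),
]


def coauthors(edges, name):
    """Map each co-author of `name` to the list of papers they share."""
    # First hop: index every paper by its authors.
    authors_by_paper = defaultdict(list)
    for researcher, paper in edges:
        authors_by_paper[paper].append(researcher)

    # Second hop: from each of `name`'s papers back out to the other authors.
    shared = defaultdict(list)
    for paper, authors in authors_by_paper.items():
        if name in authors:
            for other in authors:
                if other != name:
                    shared[other].append(paper)
    return dict(shared)


result = coauthors(CO_AUTHORED, "Dr. Smith")
for coauthor, papers in result.items():
    print(coauthor, papers)
```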

Semantic Paper Search — Find papers similar to a research topic by embedding:

SELECT title, year, distance FROM (
  SELECT expand(vectorNeighbors('Paper[embedding]', [0.8, 0.3, 0.7, 0.1], 5))
) ORDER BY distance
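
Conceptually, the query returns the k papers whose embeddings lie closest to the query vector. The brute-force Python sketch below shows that idea on three hypothetical papers. It assumes cosine distance, which the text above does not specify, and it scans every paper instead of using an index, so treat it as an illustration of the ranking only.

```python
import math

# Hypothetical papers with 4-dimensional embeddings.
PAPERS = [
    {"title": "Graph Neural Networks", "year": 2021, "embedding": [0.9, 0.2, 0.8, 0.1]},
    {"title": "Protein Folding", "year": 2020, "embedding": [0.1, 0.9, 0.2, 0.7]},
    {"title": "Knowledge Graph Completion", "year": 2022, "embedding": [0.7, 0.4, 0.6, 0.2]},
]


def cosine_distance(a, b):
    """1 - cosine similarity; 0 means the vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def nearest(papers, query, k):
    """Return the k closest papers as title/year/distance rows, nearest first."""
    rows = [
        {
            "title": p["title"],
            "year": p["year"],
            "distance": cosine_distance(p["embedding"], query),
        }
        for p in papers
    ]
    rows.sort(key=lambda row: row["distance"])
    return rows[:k]


hits = nearest(PAPERS, [0.8, 0.3, 0.7, 0.1], 2)
for row in hits:
    print(row["title"], row["year"], round(row["distance"], 4))
```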

Full-Text Abstract Search — Keyword search across paper abstracts:

SELECT title, abstract FROM Paper
WHERE SEARCH_INDEX('Paper[abstract]', 'machine learning graph')

Hybrid Semantic + Keyword Search, one query — Available since v26.5.1. Fuse the semantic-similarity ranking with the abstract keyword match into a single ranked list, then collapse to one paper per topic so the literature review surface does not over-represent a single research thread:

SELECT expand(`vector.fuse`(
    `vector.neighbors`('Paper[embedding]', [0.8, 0.3, 0.7, 0.1], 50),
    (SELECT @rid, $score FROM Paper
     WHERE SEARCH_INDEX('Paper[abstract]', 'machine learning graph') = true),
    { fusion: 'RRF', groupBy: 'topic', groupSize: 1 }
)) LIMIT 10

The semantic side surfaces conceptually similar papers even when the abstract phrasing diverges, and the full-text side anchors the result on papers that explicitly use the query terms. RRF (Reciprocal Rank Fusion) combines the two lists by rank position instead of raw score, so no score calibration is required. Replace groupBy: 'topic' with 'institution', 'venue', or 'year' to diversify along a different axis, provided Paper exposes that property. To dedupe across multiple authors of the same paper, the upstream sparse vector index can be added as a third source for term-level disambiguation.
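
For readers unfamiliar with Reciprocal Rank Fusion, the Python sketch below shows the general technique followed by a per-group cap. It is not the `vector.fuse` implementation. The constant k = 60 is the value commonly used for RRF and is an assumption here, as is applying the grouping after fusion. The paper ids, topics, and helper names are hypothetical.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse ranked id lists: each list adds 1 / (k + rank) to an id's score."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for a stable order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


def diversify(fused, group_of, group_size=1):
    """Keep at most `group_size` results per group, preserving fused order."""
    taken = defaultdict(int)
    kept = []
    for doc_id, score in fused:
        group = group_of[doc_id]
        if taken[group] < group_size:
            taken[group] += 1
            kept.append((doc_id, score))
    return kept


# Hypothetical inputs: one ranking per source, best match first.
semantic = ["p1", "p2", "p3", "p4"]
keyword = ["p3", "p1", "p5"]
topic = {"p1": "graphs", "p2": "graphs", "p3": "ml", "p4": "ml", "p5": "databases"}

fused = rrf_fuse([semantic, keyword])
top = diversify(fused, topic, group_size=1)
print([doc_id for doc_id, _ in top])
```

Because only rank positions enter the score, a paper that appears in both lists outranks one that appears in a single list at a similar position, whatever the underlying distance or relevance values were.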

Try It Yourself

git clone https://github.com/ArcadeData/arcadedb-usecases.git
cd arcadedb-usecases/knowledge-graphs
docker compose up -d
./setup.sh
./queries/queries.sh