Genie

About

This is a demo of a semantic search engine for scientific papers. It uses a neural network to embed each paper into a vector space, then runs a vector similarity search to find papers whose embeddings lie close to one another.
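To make the similarity search step concrete, here is a minimal sketch in Python. It assumes cosine similarity as the measure and a brute-force scan over a tiny in-memory index. The demo does not say which similarity measure or search engine it uses, so the names `cosine_similarity`, `most_similar`, and `TOY_INDEX`, as well as the three-dimensional vectors, are made up for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def most_similar(query, index, k=2):
    """Return the k (paper_id, score) pairs with the highest similarity."""
    scored = [(cosine_similarity(query, vec), paper_id)
              for paper_id, vec in index.items()]
    scored.sort(reverse=True)
    return [(paper_id, score) for score, paper_id in scored[:k]]


# Hypothetical embeddings; real ones would come from the neural network.
TOY_INDEX = {
    "paper-a": [0.9, 0.1, 0.0],
    "paper-b": [0.8, 0.2, 0.1],
    "paper-c": [0.0, 0.1, 0.9],
}

results = most_similar([1.0, 0.0, 0.0], TOY_INDEX, k=2)
for paper_id, score in results:
    print(paper_id, round(score, 3))
```

A brute-force scan like this compares the query against every stored vector, which is fine for a handful of papers. At the scale of a full paper collection, a dedicated vector search engine would typically use an approximate nearest-neighbor index instead.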

The neural network is trained on the arXiv dataset, which contains 1.7 million papers in the fields of physics, mathematics, computer science, and more.

The neural network is trained to predict the next word in a paper's abstract, given the previous words. Each embedding is the hidden state of the network, which serves as a vector representation of the paper's abstract.
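The idea of reading an embedding off a next-word predictor can be sketched with a toy recurrent network. This is only an illustration under loud assumptions: the demo does not state its architecture, so the simple recurrent update, the choice of the final hidden state as the embedding, the tiny vocabulary, and the random untrained weights below are all hypothetical.

```python
import math
import random

# Hypothetical toy vocabulary; index 0 stands in for unknown words.
VOCAB = ["<unk>", "we", "study", "neural", "networks", "for", "search"]
HIDDEN_SIZE = 4

rng = random.Random(0)  # fixed seed so the sketch is reproducible


def random_matrix(rows, cols):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


# Untrained weights; a real model would learn these from abstracts.
W_embed = random_matrix(len(VOCAB), HIDDEN_SIZE)    # token -> input vector
W_hidden = random_matrix(HIDDEN_SIZE, HIDDEN_SIZE)  # hidden -> hidden
W_out = random_matrix(len(VOCAB), HIDDEN_SIZE)      # hidden -> word scores


def step(hidden, token_id):
    """One recurrent update: h_t = tanh(x_t + W_hidden @ h_{t-1})."""
    x = W_embed[token_id]
    return [
        math.tanh(x[i] + sum(W_hidden[i][j] * hidden[j]
                             for j in range(HIDDEN_SIZE)))
        for i in range(HIDDEN_SIZE)
    ]


def next_word_probs(hidden):
    """Softmax over the vocabulary: the training objective's prediction."""
    logits = [sum(row[j] * hidden[j] for j in range(HIDDEN_SIZE))
              for row in W_out]
    top = max(logits)
    exps = [math.exp(v - top) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def embed_abstract(text):
    """Run the network over the words and return the final hidden state."""
    hidden = [0.0] * HIDDEN_SIZE
    for word in text.lower().split():
        token_id = VOCAB.index(word) if word in VOCAB else 0
        hidden = step(hidden, token_id)
    return hidden


embedding = embed_abstract("We study neural networks for search")
probs = next_word_probs(embedding)
print(len(embedding), "dimensional embedding")
```

The same hidden state serves two purposes here: during training it feeds `next_word_probs`, and at search time it is stored as the abstract's vector. Other choices, such as averaging the hidden states over all positions, would fit the description equally well.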

Thank you to arXiv for use of its open access interoperability.