Usage
Running the Server
# Locally
minyma server run
# Docker Quick Start
make docker_build_local
docker run \
  -p 5000:5000 \
  -e OPENAI_API_KEY=`cat openai_key` \
  -e DATA_PATH=/data \
  -v ./data:/data \
  minyma:latest
The server will now be accessible at http://localhost:5000
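Once it's running, a minimal liveness check from Python (a sketch that assumes only that the server listens on port 5000 - any HTTP response, even a 404 on the root path, proves the server is up):

# Liveness check sketch -- assumes only that the server listens on port 5000
from urllib.error import HTTPError
from urllib.request import urlopen

try:
    with urlopen("http://localhost:5000") as resp:
        print("server responded:", resp.status)
except HTTPError as e:
    print("server responded:", e.code)  # even a 404 means the server is up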
Normalizing & Loading Data
Minyma is designed to be extensible. You can add normalizers and vector DBs using the appropriate interfaces defined in ./minyma/normalizer.py and ./minyma/vdb.py. At the moment the only supported database is chroma and the only supported normalizer is the pubmed normalizer.
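As a rough illustration, a custom normalizer might look something like the sketch below. The real base class lives in ./minyma/normalizer.py; the class shape, method name, and record fields here are assumptions for illustration only.

# Hypothetical normalizer sketch -- the actual interface in
# ./minyma/normalizer.py may differ; names below are placeholders.
import json
from typing import Iterator

class MyDatasetNormalizer:
    """Converts a raw JSONL dataset into (id, text) records for a vector DB."""

    def __init__(self, filename: str):
        self.filename = filename

    def walk(self) -> Iterator[dict]:
        # Yield one normalized record per input line; "id" and "text" are
        # placeholder field names for whatever the real schema expects.
        with open(self.filename) as f:
            for i, line in enumerate(f):
                raw = json.loads(line)
                yield {"id": str(i), "text": raw.get("text", "")}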
To normalize data, you can use Minyma's normalize CLI command:
minyma normalize --filename ./pubmed_manuscripts.jsonl --normalizer pubmed --database chroma --datapath ./chroma
The above example does the following:

- Uses the pubmed normalizer
- Normalizes the ./pubmed_manuscripts.jsonl raw dataset [0]
- Loads the output into a chroma database and persists the data to the ./chroma directory
NOTE: The above dataset took about an hour to normalize on my M2 Max MacBook Pro.
[0] https://huggingface.co/datasets/TaylorAI/pubmed_author_manuscripts/tree/main
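Conceptually, the normalize-and-load step does something like the following simplified sketch, written against the chromadb client directly. The collection name and the "text" field are assumptions; Minyma's own normalizer and vdb code handle the real schema and batching.

# Simplified sketch of normalize-and-load using chromadb directly.
# The collection name and the "text" field are assumptions.
import json
import chromadb

client = chromadb.PersistentClient(path="./chroma")  # persists to ./chroma
collection = client.get_or_create_collection("pubmed")

with open("./pubmed_manuscripts.jsonl") as f:
    for i, line in enumerate(f):
        doc = json.loads(line)
        # Embeddings come from chroma's default embedding function
        collection.add(ids=[str(i)], documents=[doc["text"]])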
Development
# Create and activate a virtual environment
python3 -m venv venv
. ./venv/bin/activate
# Local Development
pip install -e .
# Creds
export OPENAI_API_KEY=`cat openai_key`
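A quick sanity check that the editable install resolves to the local checkout:

# Sanity check: the editable install should point at this repo's source tree
import minyma
print(minyma.__file__)  # expect a path inside the local ./minyma/ directory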