# Setting Up ChromaDB

**Introduction**

ChromaDB is an open-source vector embedding database designed for AI-powered applications such as semantic search, Retrieval-Augmented Generation (RAG), AI chatbots, and Large Language Model (LLM) integrations. It enables efficient storage and retrieval of vector embeddings generated from textual or multimedia data.

This deployment was implemented on a NeevCloud CPU Cloud environment using Ubuntu 22.04 Server. The ChromaDB instance was configured as the vector database backend for a future integration with AnythingLLM.

The purpose of this deployment was to create a scalable vector storage infrastructure capable of supporting AI document retrieval and contextual search operations.

The overall implementation was divided into two major phases:

**Phase 1:** Deployment and configuration of ChromaDB as the vector database backend.

**Phase 2:** Deployment of AnythingLLM and integration with the existing ChromaDB instance to enable Retrieval-Augmented Generation (RAG), semantic document search, and AI-powered document interaction.

**Prerequisites**

* NeevCloud CPU Cloud server (Ubuntu 22.04)
* SSH access to your server
* Basic knowledge of terminal commands

**Server Specifications (Your Setup)**

* **IP Address**: 103.192.198.154
* **RAM**: 8 GB
* **vCPU**: 4 cores
* **OS**: Ubuntu 22.04
* **Region**: Central India

***

#### Step-by-Step Installation Guide

**Step 1: Connect to Your Server**\
Connect to the server over SSH as `root`, replacing `<IP_address>` with the server's public IP address.

```bash
ssh root@<IP_address>
```

**Step 2: Update System Packages**\
Updating system packages ensures that all dependencies and security patches are up to date before deployment.

```bash
sudo apt update && sudo apt upgrade -y
```

**Step 3: Install Python and pip**\
ChromaDB requires Python 3.8 or higher. Install Python, pip, and the `venv` module to prepare the environment for the ChromaDB server:

```bash
sudo apt install python3 python3-pip python3-venv -y
```

Verify installation:

```bash
python3 --version
pip3 --version
```
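The version requirement can also be checked in a script rather than by eye. The following is a minimal sketch, assuming GNU `sort` with the `-V` (version sort) option, which Ubuntu 22.04 provides; the `version_ge` helper is a hypothetical name introduced here for illustration.

```bash
# Succeeds when version $1 is greater than or equal to version $2.
version_ge() {
  # sort -V orders version strings; if $2 sorts first (or ties), then $1 >= $2
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

required="3.8"   # minimum Python version stated in this guide

if command -v python3 >/dev/null 2>&1; then
  # "python3 --version" prints e.g. "Python 3.10.12"; keep the second field
  current="$(python3 --version 2>&1 | awk '{print $2}')"
  if version_ge "$current" "$required"; then
    echo "Python $current meets the minimum ($required)"
  else
    echo "Python $current is older than $required" >&2
  fi
else
  echo "python3 not found" >&2
fi
```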

<figure><img src="/files/N6KZptvwUNCGQxYQZnTg" alt=""><figcaption></figcaption></figure>

**Step 4: Create a Dedicated User (Optional but Recommended)**\
Creating a dedicated user improves security and helps isolate the deployment environment from the root account.

```bash
sudo adduser chromadb
sudo usermod -aG sudo chromadb
su - chromadb
```

**Step 5: Create Project Directory**\
Create a project directory to keep the deployment files and application data in one place.

```bash
mkdir -p ~/chromadb-server
cd ~/chromadb-server
```

**Step 6: Set Up Python Virtual Environment**

The virtual environment ensures dependency isolation and prevents conflicts with system-level Python packages.

```bash
python3 -m venv venv
source venv/bin/activate
```

**Step 7: Install ChromaDB**

With the virtual environment activated, install the ChromaDB package:

```bash
pip install chromadb
```

For the server version with HTTP API:

```bash
pip install chromadb[server]
```

This installs:

* ChromaDB server
* HTTP API support
* Required dependencies

**Step 8: Create ChromaDB Configuration**

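The configuration below enables on-disk persistence in `./chroma_data`, disables anonymized telemetry, and permits database resets. Note that the `chroma run` commands in Step 9 set the host, port, and data path through command-line flags and do not reference this file.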

Create a configuration file:

```bash
nano chroma-config.yaml
```

Add the following configuration:

```yaml
is_persistent: true
persist_directory: ./chroma_data
anonymized_telemetry: false
allow_reset: true
```
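If you prefer not to use an interactive editor, the same file can be written with a shell here-document. This is a small sketch of that alternative; it writes `chroma-config.yaml` into the current directory and overwrites any existing file with that name.

```bash
# Write the configuration from this step to chroma-config.yaml.
# The quoted 'EOF' delimiter stops the shell from expanding anything inside the block.
cat > chroma-config.yaml <<'EOF'
is_persistent: true
persist_directory: ./chroma_data
anonymized_telemetry: false
allow_reset: true
EOF

# Print the file back to confirm its contents
cat chroma-config.yaml
```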

**Step 9: Start ChromaDB Server**

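Run ChromaDB in server mode. Binding to `0.0.0.0` starts the server in external access mode, so the API is reachable through the public IP address: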

```bash
chroma run --host 0.0.0.0 --port 8000 --path ./chroma_data
```

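For production, you might want to pass additional parameters, such as a logging configuration file. The flags that `chroma run` accepts differ between ChromaDB versions, so check `chroma run --help` before relying on this form: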

```bash
chroma run --host 0.0.0.0 --port 8000 --path ./chroma_data --log-config-path ./log_config.yaml
```

<figure><img src="/files/JjS30YYOP09dSGdIMuDy" alt=""><figcaption></figcaption></figure>

**Step 10: Set Up ChromaDB as a System Service**

Create a systemd service file:

```bash
sudo nano /etc/systemd/system/chromadb.service
```

Add the following content:

```ini
[Unit]
Description=ChromaDB Vector Database
After=network.target

[Service]
Type=simple
User=chromadb
WorkingDirectory=/home/chromadb/chromadb-server
Environment="PATH=/home/chromadb/chromadb-server/venv/bin"
ExecStart=/home/chromadb/chromadb-server/venv/bin/chroma run --host 0.0.0.0 --port 8000 --path /home/chromadb/chromadb-server/chroma_data
Restart=always
RestartSec=10

[Install]
WantedBy=multi-user.target
```

**Step 11: Enable and Start the Service**

```bash
sudo systemctl daemon-reload
sudo systemctl enable chromadb
sudo systemctl start chromadb
sudo systemctl status chromadb
```
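A service can take a few seconds to become active after `systemctl start`, so a single status check may run too early. The sketch below shows one way to poll instead; the `retry` helper is a hypothetical name introduced for illustration and is not part of systemd or ChromaDB.

```bash
# Run a command up to $1 times, sleeping $2 seconds between attempts.
# Returns 0 as soon as the command succeeds, or 1 if every attempt fails.
retry() {
  attempts="$1"
  delay="$2"
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    "$@" && return 0
    i=$((i + 1))
    # Sleep only if another attempt remains
    [ "$i" -le "$attempts" ] && sleep "$delay"
  done
  return 1
}
```

For example, `retry 5 2 systemctl is-active --quiet chromadb` would poll the service up to five times at two-second intervals. If it still fails, `journalctl -u chromadb` shows the service logs.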

**Step 12: Configure Firewall**

```bash
sudo ufw allow 8000/tcp
sudo ufw status
```
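The rule above opens port 8000 to any source address. If only one known host (for example, the future AnythingLLM server) needs access, a narrower rule may be preferable. The sketch below only prints the command rather than running it; `ufw_rule_for` is a hypothetical helper, and `203.0.113.10` is a documentation-range placeholder, not a real client address.

```bash
# Build a ufw rule that admits a single client address to a single TCP port.
ufw_rule_for() {
  client_ip="$1"
  port="$2"
  echo "ufw allow from ${client_ip} to any port ${port} proto tcp"
}

# Placeholder address: substitute the IP of the host that should reach ChromaDB.
rule="$(ufw_rule_for 203.0.113.10 8000)"
echo "Would run: sudo ${rule}"

# After applying the narrower rule, the broad one can be removed with:
#   sudo ufw delete allow 8000/tcp
```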

**Step 13: Verify ChromaDB is Running**\
With the firewall rule from Step 12 in place, port 8000 accepts external connections, so the API can be tested through the public IP address.

Check that ChromaDB is responding by requesting the heartbeat endpoint, either in a browser or with `curl`:

```bash
curl http://<ip_address>:8000/api/v2/heartbeat
```

Expected response:

```
{"nanosecond heartbeat": "value"}
```
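The heartbeat check can be scripted so that a malformed or missing response is reported clearly. The sketch below assumes the value is an integer timestamp in nanoseconds, as the key name suggests; the sample response uses an illustrative value rather than output captured from a real server, and `parse_heartbeat` is a hypothetical helper.

```bash
# Extract the integer that follows the "nanosecond heartbeat" key in a JSON response.
# Prints nothing if the key or an integer value is not found.
parse_heartbeat() {
  printf '%s' "$1" |
    sed -n 's/.*"nanosecond heartbeat"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p'
}

# Illustrative sample. Against a live server, use instead:
#   response="$(curl -s http://<ip_address>:8000/api/v2/heartbeat)"
response='{"nanosecond heartbeat": 1700000000000000000}'

ns="$(parse_heartbeat "$response")"
if [ -n "$ns" ]; then
  # Dividing nanoseconds by 10^9 gives whole seconds since the Unix epoch
  echo "Heartbeat OK, server time in epoch seconds: $((ns / 1000000000))"
else
  echo "Unexpected response: $response" >&2
fi
```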

This confirms:

* ChromaDB is running
* API is accessible
* External connectivity works successfully

<figure><img src="/files/PLByyUGyi1w1wLq1ZA13" alt=""><figcaption></figcaption></figure>

**Conclusion**

ChromaDB was successfully deployed on a NeevCloud Ubuntu server using a Python virtual environment. The API was verified using the heartbeat endpoint and external connectivity was confirmed through the public IP address.


