Setting Up ChromaDB

Introduction

ChromaDB is an open-source vector embedding database designed for AI-powered applications such as semantic search, Retrieval-Augmented Generation (RAG), AI chatbots, and Large Language Model (LLM) integrations. It enables efficient storage and retrieval of vector embeddings generated from textual or multimedia data.

This deployment was implemented on a NeevCloud CPU Cloud environment using Ubuntu 22.04 Server. The ChromaDB instance was configured as the vector database backend for a future integration with AnythingLLM.

The purpose of this deployment was to create a scalable vector storage infrastructure capable of supporting AI document retrieval and contextual search operations.

The overall implementation was divided into two major phases:

Phase 1: Deployment and configuration of ChromaDB as the vector database backend.

Phase 2: Deployment of AnythingLLM and integration with the existing ChromaDB instance to enable Retrieval-Augmented Generation (RAG), semantic document search, and AI-powered document interaction.

Prerequisites

  • NeevCloud CPU Cloud server (Ubuntu 22.04)

  • SSH access to your server

  • Basic knowledge of terminal commands

Server Specifications (Your Setup)

  • IP Address: 103.192.198.154

  • RAM: 8 GB

  • vCPU: 4 cores

  • OS: Ubuntu 22.04

  • Region: Central India


Step-by-Step Installation Guide

Step 1: Connect to Your Server

After establishing SSH connectivity with the server, the environment was prepared for software installation and deployment activities.
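A minimal sketch of the connection, using the IP address from the server specifications above. The login name root is an assumption; substitute the account that was issued with your server.

```shell
# "root" is an assumed login name; use the account issued with your server.
ssh root@103.192.198.154
```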

Step 2: Update System Packages

Updating system packages ensures that all dependencies and security patches are up to date before deployment.
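On Ubuntu 22.04 this step typically comes down to the two standard apt commands below. The -y flag, which answers the confirmation prompt automatically, is a convenience assumed here.

```shell
# Refresh the package index, then upgrade the installed packages.
sudo apt update
sudo apt upgrade -y
```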

Step 3: Install Python and pip

ChromaDB requires Python 3.8 or higher. The required Python environment was successfully prepared for running the ChromaDB server application.
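A minimal sketch of the installation, assuming the stock Ubuntu 22.04 packages. The python3-venv package is included because the virtual environment step later in this guide depends on it.

```shell
# Install the interpreter, pip, and the venv module from the Ubuntu repositories.
sudo apt install -y python3 python3-pip python3-venv
```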

Verify installation:
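The check can be sketched as follows. The last command is an added convenience rather than part of the original deployment: it exits with a non-zero status if the interpreter is older than the 3.8 minimum stated above.

```shell
# Print the interpreter and pip versions.
python3 --version
pip3 --version

# Exit non-zero if the interpreter is older than the 3.8 minimum.
python3 -c 'import sys; sys.exit(0 if sys.version_info >= (3, 8) else 1)'
```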

Step 4: Create a Dedicated User (Optional but Recommended)

Creating a dedicated user improves security and helps isolate the deployment environment from the root account.
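A minimal sketch of this step. The account name chromadb is hypothetical; any unprivileged name works.

```shell
# "chromadb" is a hypothetical account name chosen for this example.
# adduser prompts interactively for a password and optional details.
sudo adduser chromadb

# Open a login shell as the new user.
sudo su - chromadb
```

An account created this way has no sudo rights by default, so the later steps that use sudo (service setup and firewall) need to be run from an administrative account.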

Step 5: Create Project Directory

The project directory was created to organize deployment files and application data efficiently.
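For example, assuming the project lives in a directory named chromadb under the current user's home directory (the name and location are assumptions):

```shell
# "chromadb" under the home directory is an assumed location.
mkdir -p "$HOME/chromadb"
cd "$HOME/chromadb"
```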

Step 6: Set Up Python Virtual Environment

The virtual environment ensures dependency isolation and prevents conflicts with system-level Python packages.
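A minimal sketch using Python's built-in venv module. The path $HOME/chromadb/venv is an assumption that matches a project directory named chromadb in the home directory.

```shell
# Create the environment in an assumed project directory, then activate it.
python3 -m venv "$HOME/chromadb/venv"
. "$HOME/chromadb/venv/bin/activate"

# The python on PATH should now be the one inside the environment.
command -v python
```

Activation only affects the current shell session, so it has to be repeated after logging in again.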

Step 7: Install ChromaDB

After successful package installation, the ChromaDB server was ready for deployment and API configuration.

For the server version with HTTP API:

This installs:

  • ChromaDB server

  • HTTP API support

  • Required dependencies
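A minimal sketch of the installation, assuming the standard chromadb package from PyPI, which provides the chroma command-line tool used to run the HTTP server in the following steps. Upgrading pip first is an optional precaution added here.

```shell
# Assumes the virtual environment from Step 6 is active.
pip install --upgrade pip
pip install chromadb

# The package provides the "chroma" command-line entry point.
chroma --help
```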

Step 8: Create ChromaDB Configuration

The server was configured for external access so that the API can be reached through the public IP address.

Create a configuration file:

Add the following configuration (YAML):

Step 9: Start ChromaDB Server

Run ChromaDB in server mode:
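A minimal sketch using the chroma CLI. Binding to 0.0.0.0 makes the server listen on all interfaces, which is what allows access through the public IP address. Port 8000 is Chroma's default and is assumed to be the port used in this deployment.

```shell
# Listen on all interfaces on port 8000 (Chroma's default port).
# The command runs in the foreground; stop it with Ctrl+C.
chroma run --host 0.0.0.0 --port 8000
```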

For production, you might want to specify additional parameters:
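One parameter worth setting explicitly is --path, which controls where the data is persisted on disk. The directory below is an assumed location, not one taken from the original deployment.

```shell
# --path sets where data is persisted; the directory below is an assumed location.
chroma run --host 0.0.0.0 --port 8000 --path "$HOME/chromadb/chroma_data"
```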

Step 10: Set Up ChromaDB as a System Service

Create a systemd service file:

Add the following content:
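The sketch below generates a unit file in the current directory. The user name, project directory, data path, and restart policy are all hypothetical values and need to match your own setup.

```shell
# Hypothetical values: adjust the user and paths to match your own setup.
SERVICE_USER="chromadb"
PROJECT_DIR="/home/${SERVICE_USER}/chromadb"

# Write the unit file locally; the variables are expanded by the shell.
cat > chromadb.service <<EOF
[Unit]
Description=ChromaDB vector database server
After=network.target

[Service]
Type=simple
User=${SERVICE_USER}
WorkingDirectory=${PROJECT_DIR}
ExecStart=${PROJECT_DIR}/venv/bin/chroma run --host 0.0.0.0 --port 8000 --path ${PROJECT_DIR}/chroma_data
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
EOF
```

Copying the result into place with sudo cp chromadb.service /etc/systemd/system/ makes it visible to systemd. ExecStart points at the chroma binary inside the virtual environment, so the service does not depend on the environment being activated.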

Step 11: Enable and Start the Service
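A minimal sketch, assuming the unit file from the previous step was saved as /etc/systemd/system/chromadb.service, so that the service name is chromadb:

```shell
# Make systemd pick up the new unit file.
sudo systemctl daemon-reload

# Start the service at boot, then start it now.
sudo systemctl enable chromadb
sudo systemctl start chromadb

# Confirm that the service is active.
sudo systemctl status chromadb
```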

Step 12: Configure Firewall
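A minimal sketch using ufw, the default firewall front end on Ubuntu. Port 8000 is assumed to be the ChromaDB service port. SSH is allowed first so that enabling the firewall does not cut off the current session.

```shell
# Keep SSH reachable before the firewall is turned on.
sudo ufw allow 22/tcp

# Open the ChromaDB API port (8000 is assumed here).
sudo ufw allow 8000/tcp

# Turn the firewall on and review the resulting rules.
sudo ufw enable
sudo ufw status
```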

Step 13: Verify ChromaDB is Running

With the firewall rules updated to permit external connectivity on the required service port, the final step is to confirm that the server answers requests.

Check if ChromaDB is responding:
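A minimal sketch of the check, assuming port 8000. The path below is the v2 heartbeat route used by current Chroma releases; older 0.x servers expose /api/v1/heartbeat instead, so adjust it to the installed version.

```shell
# Local check, run on the server itself.
curl http://localhost:8000/api/v2/heartbeat

# External check through the public IP address, run from another machine.
curl http://103.192.198.154:8000/api/v2/heartbeat
```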

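Expected response: a JSON object with a single "nanosecond heartbeat" key whose value is the server's current timestamp.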

This confirms:

  • ChromaDB is running

  • API is accessible

  • External connectivity works successfully

Conclusion

ChromaDB was successfully deployed on a NeevCloud Ubuntu server using a Python virtual environment. The API was verified using the heartbeat endpoint, and external connectivity was confirmed through the public IP address.
