Deploying a RAG API Server with Railway in Just 5 Minutes Without Docker

A guide to deploying a RAG system for quant research on Railway. Railway redeploys automatically on every GitHub push, and its $5/month Hobby plan undercuts Heroku's $7/month Basic plan.

Why Is Deployment a Hassle?

I built a RAG system locally. It combines LangChain, Qdrant, and the Claude API, and it performs well on financial document search. To share it with my team, though, I need to host it on a server.

Typical options are:

  1. SSH into EC2 and set up manually — cumbersome and hard to manage
  2. Docker + ECS/Kubernetes — overly complex
  3. Heroku — expensive (no free plan anymore)
  4. Railway — I chose this

Why Railway Is Convenient

Railway connects to your GitHub repo and automatically builds and deploys on main branch pushes. If you have a Dockerfile, it uses that; otherwise, it detects the language and builds automatically.

For a Python FastAPI project:

  1. Push code to GitHub
  2. In Railway, select “Deploy from GitHub”
  3. Choose the repository
  4. Done

Environment variables are set in the Railway dashboard, so there is no need to hardcode sensitive values such as ANTHROPIC_API_KEY or QDRANT_URL.
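Because the server depends on these variables, it can help to check them once at startup and fail with a clear message instead of a bare KeyError. The sketch below is one way to do that; the `load_settings` helper and the `REQUIRED_VARS` tuple are hypothetical names introduced here, not part of Railway or FastAPI.

```python
import os
from typing import Mapping

# Variables the RAG server expects to find in the Railway dashboard.
REQUIRED_VARS = ("ANTHROPIC_API_KEY", "QDRANT_URL")


def load_settings(environ: Mapping[str, str] = os.environ) -> dict:
    """Return the required settings, or raise an error naming every missing one."""
    missing = [name for name in REQUIRED_VARS if not environ.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: environ[name] for name in REQUIRED_VARS}
```

Calling `load_settings()` at the top of main.py would make a misconfigured deploy fail immediately in the build logs rather than on the first request.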


Example FastAPI RAG Server

# main.py
import os

from anthropic import Anthropic
from fastapi import FastAPI
from qdrant_client import QdrantClient

app = FastAPI()
# Both values come from environment variables set in the Railway dashboard.
client = Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])
qdrant = QdrantClient(url=os.environ["QDRANT_URL"])


def embed(text: str) -> list[float]:
    # Placeholder: return the query embedding from the same model
    # that was used to index the research_docs collection.
    raise NotImplementedError("plug in your embedding model")


@app.post("/query")
async def query(question: str) -> dict:
    # 1. Search relevant documents in Qdrant
    results = qdrant.search(
        collection_name="research_docs",
        query_vector=embed(question),
        limit=5,
    )
    context = "\n\n".join(r.payload["text"] for r in results)

    # 2. Generate answer using Claude
    response = client.messages.create(
        model="claude-3-5-sonnet-20241022",
        max_tokens=1000,
        messages=[{
            "role": "user",
            "content": f"Context:\n{context}\n\nQuestion: {question}",
        }],
    )
    return {"answer": response.content[0].text}

As long as requirements.txt is included, Railway automatically sets up the Python environment.
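The /query handler above joins every retrieved chunk into one context string. If the stored documents are long, that string can grow large, so one option is to cap it. The following is a minimal sketch under that assumption; `build_context` and its `max_chars` budget are hypothetical and not part of the original server.

```python
def build_context(chunks: list[str], max_chars: int = 8000, separator: str = "\n\n") -> str:
    """Join chunks in ranked order, stopping before the character budget is exceeded."""
    selected: list[str] = []
    used = 0
    for chunk in chunks:
        # Every chunk after the first also costs one separator.
        extra = len(chunk) + (len(separator) if selected else 0)
        if used + extra > max_chars:
            break
        selected.append(chunk)
        used += extra
    return separator.join(selected)
```

In the handler, this could replace the plain join: `context = build_context([r.payload["text"] for r in results])`. Since Qdrant returns results ordered by similarity, stopping early drops the least relevant chunks first.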


Cost Calculation

Railway Pricing (as of April 2026):

  • Hobby Plan: $5/month — 512MB RAM, 1 vCPU, 1GB disk
  • Pro Plan: $20/month — 8GB RAM, 8 vCPU, 100GB disk
  • Charges based only on actual usage (running time × resources)

A lightweight internal RAG API for a quant team fits the $5/month Hobby plan. FastAPI itself uses little memory, and Qdrant can run as a separate instance, for example on the Qdrant Cloud free tier.
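To make the usage-based billing described above concrete, here is a small sketch of the arithmetic, assuming the charge is simply running time multiplied by the resources held. The function name and both rate parameters are hypothetical, and the rates in the usage lines are placeholders rather than Railway's actual prices; substitute the figures from the current pricing page.

```python
def estimate_usage_cost(
    hours: float,
    memory_gb: float,
    vcpu: float,
    usd_per_gb_hour: float,
    usd_per_vcpu_hour: float,
) -> float:
    """Estimate a usage charge as running time x (memory cost + CPU cost)."""
    hourly = memory_gb * usd_per_gb_hour + vcpu * usd_per_vcpu_hour
    return hours * hourly


# PLACEHOLDER rates for illustration only, not Railway's real prices.
always_on = estimate_usage_cost(730, 0.5, 0.25, usd_per_gb_hour=0.01, usd_per_vcpu_hour=0.02)
office_hours = estimate_usage_cost(200, 0.5, 0.25, usd_per_gb_hour=0.01, usd_per_vcpu_hour=0.02)
print(f"always on: ${always_on:.2f}, office hours only: ${office_hours:.2f}")
```

The point of the comparison is that an internal tool used only during working hours is billed for far fewer hours than one that runs around the clock.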

Comparison with Heroku:

  • Heroku Basic: $7/month (512MB RAM, single server)
  • Railway Hobby: $5/month plus usage-based charges, and you can deploy multiple services

Step-by-Step Setup

1. Code Structure

my-rag-api/
├── main.py
├── requirements.txt
└── (Optional: Dockerfile)

requirements.txt

fastapi==0.115.0
uvicorn==0.32.0
anthropic==0.40.0
qdrant-client==1.12.0

2. Deploy on Railway

  1. Visit railway.app and log in with GitHub
  2. New Project → Deploy from GitHub repo
  3. Select your repository
  4. Add environment variables: ANTHROPIC_API_KEY, QDRANT_URL
  5. Automatic deployment completes

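Railway detects that main.py is a FastAPI app and starts it with uvicorn main:app --host 0.0.0.0 --port $PORT. If the start command is not detected for your project, set the same command as the custom start command in the service settings.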
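The key detail in that command is that the port comes from the PORT environment variable that Railway injects, not from a hardcoded number. As a rough sketch, the same command can be rebuilt in Python for a local launcher script; `resolve_port` and `uvicorn_command` are hypothetical helper names, and the fallback port 8000 is an assumption for local runs.

```python
import os
from typing import Mapping


def resolve_port(environ: Mapping[str, str] = os.environ, default: int = 8000) -> int:
    """Read PORT from the environment, falling back to a local default."""
    raw = environ.get("PORT", "")
    if not raw:
        return default
    port = int(raw)
    if not 0 < port < 65536:
        raise ValueError(f"PORT out of range: {port}")
    return port


def uvicorn_command(port: int, app_path: str = "main:app") -> list[str]:
    """Build the argument list matching the start command described above."""
    return ["uvicorn", app_path, "--host", "0.0.0.0", "--port", str(port)]
```

Passing `uvicorn_command(resolve_port())` to `subprocess.run` would start the server the same way locally as on Railway. Binding to 0.0.0.0 rather than 127.0.0.1 matters because the platform routes traffic to the container from outside.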

3. Custom Domain (Optional)

The default domain has the form xxx.up.railway.app. To use a custom domain, add it in the Railway dashboard and then create the CNAME record it shows at your DNS provider.


Useful Pattern: Slack Webhook Integration

For internal quant RAG bots, linking with Slack slash commands is convenient.

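import asyncio

from fastapi import Form  # Form fields require the python-multipart package


@app.post("/slack/command")
async def slack_command(
    text: str = Form(...),
    response_url: str = Form(...)
) -> dict:
    # Acknowledge right away and run the slow RAG query in the background.
    # process_and_respond is a helper you define; it posts the answer to response_url.
    asyncio.create_task(
        process_and_respond(text, response_url)
    )
    return {"text": "Analyzing... (response will be sent shortly)"}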

Typing /rag today BTC on-chain summary in Slack makes the RAG server search the stored documents and reply with an answer.
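The handler above calls process_and_respond without showing it. Slack expects a slash command to be acknowledged within about three seconds, which is why the answer is delivered later by POSTing JSON to the response_url. Below is one possible sketch of that helper using only the standard library. The `answer_fn` and `post_fn` parameters are assumptions added here so the RAG call and the HTTP call can be swapped out; the original two-argument call would need `answer_fn` bound first, for example with functools.partial.

```python
import asyncio
import json
import urllib.request
from typing import Awaitable, Callable


def build_slack_payload(answer: str) -> dict:
    """Build the JSON body for a delayed slash-command response."""
    # "in_channel" shows the reply to everyone; use "ephemeral" for the caller only.
    return {"response_type": "in_channel", "text": answer}


def post_json(url: str, payload: dict, timeout: float = 10.0) -> int:
    """Blocking JSON POST; returns the HTTP status code."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


async def process_and_respond(
    text: str,
    response_url: str,
    answer_fn: Callable[[str], Awaitable[str]],
    post_fn: Callable[[str, dict], int] = post_json,
) -> None:
    """Run the RAG query, then send the result (or an error note) back to Slack."""
    try:
        answer = await answer_fn(text)
    except Exception as exc:  # report the failure instead of leaving the user waiting
        answer = f"Sorry, the query failed: {exc}"
    # urllib is blocking, so run it in a worker thread to keep the event loop free.
    await asyncio.to_thread(post_fn, response_url, build_slack_payload(answer))
```

Catching the exception inside the task matters because nothing awaits a task started with asyncio.create_task, so an unhandled error there would otherwise go unnoticed by the user.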


Conclusion

Railway is a great service for deploying Python servers without complex DevOps. It automates CI/CD via GitHub integration and manages environment variables cleanly.

For internal research tools or team-shared APIs, the Hobby plan at $5/month is enough. As traffic grows, upgrade to Pro or add instances.

Signing up for Railway through the referral link gets you a $20 credit.
