Punk Band of Software Development - Vibe Coding with All Hands AI, Devstral, and vLLM


Software development is evolving rapidly, and Artificial Intelligence is no longer a futuristic idea reserved for academia—it’s becoming an integral part of everyday engineering workflows. This shift is not just about making autocomplete smarter; it's about transforming development into a fluid, intelligent, and collaborative experience. This movement, which we call “Vibe Coding,” is redefining the entire dev process.



What You’ll Discover in This Post


  • Understanding Vibe Coding – What it means and why it's the next frontier in software development.

  • Overview of All Hands AI – Key features, security-first design, and integration benefits with Devstral and vLLM.

  • A detailed setup guide – From GPU provisioning to model serving with step-by-step instructions.

  • Visual documentation – Screenshots, dashboards, and UI views to aid setup and monitoring.

  • Real-world scenarios – Practical examples and productivity tips to harness the full power of this AI toolchain.


Vibe Coding: Rethinking the Developer Experience


Traditional coding often means hours of isolated work, repetitive debugging, and generating boilerplate code. But with AI, there's a chance to shift from monotony to flow. Vibe Coding centers around harnessing intelligent tools to build an experience that is seamless, creative, and deeply satisfying.

Imagine coding alongside an AI assistant that:

  • Proposes contextual optimizations proactively.

  • Detects subtle bugs before they become blockers.

  • Automatically writes boilerplate, tests, and structured refactors.

  • Helps teams stay in sync by suggesting improvements across shared codebases.

Vibe Coding is about staying in a state of flow—where deep work becomes natural. Tools like All Hands AI serve as intelligent, context-aware copilots that understand your intent and architecture, helping teams focus on what truly matters: building exceptional software.


Meet All Hands AI: Secure, Customizable, and Developer-Centric


In an age dominated by cloud-first services, All Hands AI is a powerful counterpoint. It’s a fully self-hostable, open-source AI coding assistant built with privacy, control, and extensibility in mind—ideal for teams wary of proprietary tools like GitHub Copilot or Cody.


What Sets It Apart?


  • Full Self-Hosting: Keep your codebase and LLM processing within your own infrastructure. Essential for compliance-heavy industries.

  • Tailored Integrations: Configure it to your team’s specific workflows and even fine-tune it with private repo embeddings.

  • Data Sovereignty: Avoid vendor lock-in. Your intellectual property stays secure and within your control.

Think of it as a company-specific Copilot—trained on your internal best practices, libraries, and unique project DNA.


The Power Trio: All Hands AI + Devstral + vLLM on Denvr Cloud


Pairing All Hands AI with the right backend tech stack transforms it from helpful to indispensable. Enter Devstral, a purpose-built LLM for software engineering, and vLLM, a high-performance inference engine. Hosted on Denvr Cloud, this trio creates a best-in-class AI development platform.


Devstral: LLM Purpose-Built for Engineers

Created in partnership with Mistral AI, Devstral is optimized for agentic workflows and coding complexity:

  • Navigates codebases intelligently, identifying dependencies and relevant snippets.

  • Edits across multiple files, enabling holistic code refactors.

  • Drives AI agents that handle automation, testing, and static analysis.

Its strong performance on the SWE-bench benchmark cements it as one of the most capable open LLMs for dev teams.


vLLM: Accelerated LLM Inference


Built for scale and speed, vLLM ensures fast responses from large models like Devstral. Thanks to techniques like PagedAttention, it offers:

  • High throughput

  • Low latency

  • Efficient GPU memory usage

It’s the perfect engine to fuel real-time interactions with your AI assistant.


Denvr Cloud: Optimized Hosting for AI Workloads


Want to deploy your stack at scale? Denvr Cloud offers:

  • 🚀 Top-tier GPUs like A100, H100, and Gaudi2.

  • 🛠️ Container-native workflows for CI/CD, Docker, and Kubernetes.

  • 🔐 Enterprise-ready isolation, perfect for regulated teams.

Whether you're prototyping solo or scaling across an organization, Denvr Cloud is a battle-tested choice.


Deployment Prerequisites


Before diving in, make sure you have:

  • A Denvr Cloud account with GPU resources.

  • (Optional) A Hugging Face account for model access.


Step-by-Step Guide to Deploying the Stack


1. Provision GPU VM on Denvr Cloud

  • Choose an A100 or H100 instance.

  • Recommended OS: Ubuntu 22.04.

  • Ensure Docker and SSH access are enabled.

  • Add the user to the Docker group:

    sudo usermod -a -G docker ubuntu

  • Clone the deployment repo:

    git clone https://github.com/denvrdata/examples.git


2. Install vLLM and Dependencies


pip install --upgrade vllm pyopenssl
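Before launching the server, it's worth confirming the install actually landed. A small standard-library check (the helper below is ours, not part of the repo; the package names are the ones from the pip command above):

```python
import importlib.metadata

def package_installed(name: str) -> bool:
    """Return True if a pip distribution with this name is present."""
    try:
        importlib.metadata.version(name)
        return True
    except importlib.metadata.PackageNotFoundError:
        return False

# After the pip install above, both should report True on the GPU VM.
for pkg in ("vllm", "pyopenssl"):
    print(pkg, package_installed(pkg))
```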

3. Launch Devstral with vLLM

screen -S vllm
vllm serve mistralai/Devstral-Small-2505 \
  --tokenizer_mode mistral \
  --config_format mistral \
  --load_format mistral \
  --tool-call-parser mistral \
  --enable-auto-tool-choice \
  --tensor-parallel-size 8

Check availability with:

python3 llm_ping.py <IP>
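The llm_ping.py helper ships with the cloned examples repo and isn't reproduced here; as a rough idea of what such a check involves, here is a minimal stand-in that polls vLLM's /health endpoint, which returns HTTP 200 once the model is loaded and the server is ready:

```python
import sys
import urllib.request

def ping(host: str, port: int = 8000, timeout: float = 5.0) -> bool:
    """Return True if the vLLM server at host:port reports healthy."""
    url = f"http://{host}:{port}/health"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # Connection refused, timeout, or DNS failure: server not ready.
        return False

if __name__ == "__main__":
    host = sys.argv[1] if len(sys.argv) > 1 else "localhost"
    print("ready" if ping(host) else "not ready")
```

Port 8000 is vLLM's default; adjust if you passed `--port` to `vllm serve`.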

Monitoring with Prometheus and Grafana

From the examples/all-hands-ai-mistral/ folder:

docker compose up -d

  • Access Grafana at http://<IP>:3000

  • Add Prometheus as a data source

  • Import dashboards:

    • vllm_dashboard.json

    • NVIDIA DCGM (GPU metrics)
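Grafana aside, vLLM also publishes its Prometheus metrics directly at /metrics on the serving port, so you can spot-check counters from a script. A sketch (the `vllm:` metric prefix matches what current vLLM versions emit; the sample scrape below is illustrative):

```python
def filter_vllm_metrics(metrics_text: str) -> list[str]:
    """Keep only vLLM's own series, dropping comments and other exporters."""
    return [line for line in metrics_text.splitlines()
            if line.startswith("vllm:")]

# Example of what the filter extracts from a /metrics scrape:
sample = """# HELP vllm:num_requests_running Number of requests currently running.
# TYPE vllm:num_requests_running gauge
vllm:num_requests_running 2.0
process_cpu_seconds_total 12.5"""
print(filter_vllm_metrics(sample))
```

Fetching the real text is one `urllib.request.urlopen("http://<IP>:8000/metrics")` call away, on the same port as the API server.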


Running the OpenHands App


Launch the app with:

docker run -it --rm --pull=always \
  -e SANDBOX_RUNTIME_CONTAINER_IMAGE=docker.all-hands.dev/all-hands-ai/runtime:0.38-nikolaik \
  -e LOG_ALL_EVENTS=true \
  -v /var/run/docker.sock:/var/run/docker.sock \
  -v ~/.openhands-state:/.openhands-state \
  -p 3001:3000 \
  --add-host host.docker.internal:host-gateway \
  --name openhands-app \
  docker.all-hands.dev/all-hands-ai/openhands:0.38

Note: Grafana uses port 3000, so OpenHands is mapped to 3001.

Visit: http://<IP>:3001

Configure Custom LLM Inference


[Screenshot: LLM configuration option on first run]

Within the app’s settings:

  • Model: openai/mistralai/Devstral-Small-2505

  • Base URL: http://<your-server>:8000/v1

  • API Key: token
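The same three settings drive any OpenAI-compatible client, so you can sanity-check the endpoint outside the app as well. A standard-library sketch (the request path and payload shape follow the OpenAI chat-completions convention that vLLM serves; the `openai/` prefix in the app's model field is a provider hint for the app, not part of the served model name, and the prompt here is illustrative):

```python
import json
import urllib.request

BASE_URL = "http://<your-server>:8000/v1"  # Base URL from the settings above
API_KEY = "token"                          # API Key from the settings above
MODEL = "mistralai/Devstral-Small-2505"    # model name as served by vLLM

def build_request(prompt: str) -> urllib.request.Request:
    """Assemble a chat-completion request matching the app's LLM settings."""
    body = {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 256,
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(body).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {API_KEY}",
        },
    )

req = build_request("Say hello in one word.")
print(req.full_url)
```

Sending it with `urllib.request.urlopen(req)` returns the usual OpenAI-style JSON response once the server from step 3 is up.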

[Screenshot: Custom LLM configuration]

Example: All Hands AI in Action


[Screenshot: All Hands AI coding with monitoring in action]

Prompt

Build a To-Do app using FastAPI and React, with:

  • Task add/delete/complete features

  • One-page layout

  • SQLite backend

Output


[Screenshot: vLLM monitoring data in Grafana]

[Screenshot: NVIDIA DCGM GPU monitoring data]

A fully scaffolded, cleanly written codebase generated instantly by your AI pair programmer.


[Screenshot: code generated by All Hands AI]

Real-World Applications


  • Mass syntax corrections in monorepos

  • Custom security rule audits

  • Test generation for legacy code

  • Repository-aware smart suggestions


Conclusion


When you combine All Hands AI with Devstral and vLLM—especially on Denvr Cloud—you’re creating more than a Copilot alternative. You’re building a secure, extensible, AI-powered development platform tailored to your team’s unique needs.

