Processing sensitive data in public clouds raises legitimate privacy concerns. Learn how to build a fully sovereign and controlled alternative to commercial AI models in just 20 minutes, using open-source software, the DAX platform, and the OpenWork interface in your own cloud.
Why build a sovereign alternative to Claude?
Commercial language models, while impressive in their capabilities, come with significant limitations. Using external APIs requires transmitting sensitive corporate or private data to external providers' servers. For many organizations that adhere to strict security standards or are subject to legal regulations, this is an insurmountable barrier. Additionally, dependence on pricing and availability of external services (so‑called vendor lock‑in) carries operational risk.
The solution is to build sovereign AI infrastructure. By deploying open-source software (OSS) in a cloud environment we control, we gain full control over data flow. Although the market is dominated by ready-made solutions such as Claude Fable 5 and other advanced commercial models, modern tools let us deploy an efficient alternative locally or in a private cloud in a matter of minutes. Instead of asking how to use a hosted service such as ChatGPT safely, we can create our own isolated work environment.
What is DAX and how does it simplify GPU management?
The biggest challenge when deploying open-source models yourself is configuring the hardware infrastructure. This usually means installing complex NVIDIA drivers, setting up a Docker environment, managing storage, and downloading files tens of gigabytes in size, with transfers that are often interrupted.
Enter DAX (available on GitHub: github.com/dagploy/dax). It is a specialized tool that fully automates the provisioning of GPU instances in a compute cloud (e.g., Google Cloud Platform). DAX removes the need to configure drivers and containers by hand, so you can launch a ready-made inference environment in minutes. Before you start, make sure your cloud account (e.g., your GCP project) has GPU quota available.
You can learn more about the configuration process and see a demonstration of this solution in the video tutorial available on YouTube.
Installation and deployment of models step by step
The deployment process for a Claude alternative consists of three main steps: installing the DAX tool, downloading the Docker images and the GPT OSS 20B model (or a comparable one), and launching the vLLM inference engine.
Step 1: Install DAX
In the first step, download and install the DAX tool on your workstation or management server according to the instructions in the project repository. The entire initial configuration should take no more than 5 minutes.
Step 2: Download GPT OSS 20B and vLLM
Before launching virtual machines, it's advisable to cache the Docker images and model weights locally. In total, about 100 GB of data will need to be downloaded, so a stable and fast internet connection is essential. The commands in this step assume basic familiarity with Linux system administration.
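To get a feel for what "fast" means here, the following is a rough back-of-the-envelope sketch of the ideal transfer time for the roughly 100 GB mentioned above. The bandwidth values are illustrative assumptions, and the estimate ignores protocol overhead, throttling, and retries, so real downloads will take longer.

```python
def estimate_download_minutes(size_gb: float, bandwidth_mbps: float) -> float:
    """Return the ideal transfer time in minutes, ignoring all overhead."""
    if bandwidth_mbps <= 0:
        raise ValueError("bandwidth_mbps must be positive")
    # 1 GB is treated as 1000 MB, and 1 byte as 8 bits.
    size_megabits = size_gb * 1000 * 8
    seconds = size_megabits / bandwidth_mbps
    return seconds / 60


if __name__ == "__main__":
    # Illustrative link speeds in Mbit/s (assumptions, not measurements).
    for mbps in (100, 500, 1000):
        minutes = estimate_download_minutes(100, mbps)
        print(f"{mbps} Mbit/s: about {minutes:.0f} minutes")
```

On a 100 Mbit/s link the ideal case is already more than two hours, which is why caching the images and weights once, before any GPU machine is started, is worthwhile.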
First we pull and cache the vLLM engine image and the Open WebUI interface:
dax run download_docker vllm/vllm-openai:nightly,ghcr.io/open-webui/open-webui:main --images vllm-lib --image-size 100
Next we download the GPT OSS 20B model weights directly from Hugging Face:
dax run download_hf openai/gpt-oss-20b --image-size 50
After downloading the files we can safely connect to our GCP virtual machine using SSH tunneling or a dedicated VPN connection.
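As a minimal sketch of the SSH tunneling option, the snippet below builds a standard OpenSSH local port-forwarding command. The user name and host address are hypothetical placeholders. Port 8000 is an assumption taken from the baseURL in this article's configuration example, so adjust it if your stack exposes the service on a different port.

```python
import shlex


def ssh_tunnel_command(user: str, host: str,
                       local_port: int = 8000,
                       remote_port: int = 8000) -> list[str]:
    """Build an ssh command that forwards local_port to remote_port on the VM."""
    return [
        "ssh",
        "-N",  # do not run a remote command, only forward ports
        "-L", f"{local_port}:localhost:{remote_port}",
        f"{user}@{host}",
    ]


if __name__ == "__main__":
    # Hypothetical user and address; replace with your VM's details.
    cmd = ssh_tunnel_command("user", "203.0.113.10")
    print(shlex.join(cmd))
    # To open the tunnel from Python instead of a terminal:
    # import subprocess; subprocess.Popen(cmd)
```

While the tunnel is open, requests to localhost:8000 on the laptop reach port 8000 on the virtual machine without exposing that port to the public internet.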
Step 3: Run inference on the virtual machine
When the data is ready, we initiate the inference process with the following DAX command, which configures the machine stack:
dax run create_vm_inference --stack-name gptoss --config-json '{"images":["models--openai"]}'
Integration with OpenWork: User interface configuration
With a running backend in the cloud, we need a convenient interface for daily work with the model. For this purpose we will use the OpenWork tool, which can be installed directly on your laptop. You can download the latest version of the application from the official releases on GitHub: OpenWork releases.
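Before configuring the client, it can help to confirm that the backend actually answers. The sketch below assumes the stack exposes vLLM's OpenAI-compatible API at http://localhost:8000/v1 (the baseURL used in this article's configuration example, reachable for instance through an SSH tunnel) and lists the model ids the server reports.

```python
import json
import urllib.request

# Assumption: same base URL as in the article's example configuration.
BASE_URL = "http://localhost:8000/v1"


def parse_model_ids(payload: dict) -> list[str]:
    """Extract model ids from an OpenAI-style /models response body."""
    return [entry["id"] for entry in payload.get("data", [])]


def list_served_models(base_url: str = BASE_URL, timeout: float = 5.0) -> list[str]:
    """Query the /models endpoint and return the ids of the served models."""
    with urllib.request.urlopen(f"{base_url}/models", timeout=timeout) as resp:
        return parse_model_ids(json.load(resp))


if __name__ == "__main__":
    try:
        print("Served models:", list_served_models())
    except (OSError, ValueError) as exc:
        # Connection refused, timeout, HTTP error, or a non-JSON response.
        print(f"Endpoint not usable yet: {exc}")
```

The ids printed here are the names the server expects in requests, so they are a useful reference when filling in the client configuration.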
To connect the OpenWork application to our GPT OSS 20B instance running on the virtual machine, we need to modify its configuration file. On Linux systems we edit the file using a text editor (e.g., Vim):
vim ~/.config/opencode/opencode.json
In the configuration file you must precisely match connection parameters such as model, port, and the URL where the service is available. An example file structure looks like this:
{
  "$schema": "https://opencode.ai/config.json",
  "provider": {
    "my-api": {
      "npm": "@ai-sdk/openai-compatible",
      "name": "GPT OSS 20B",
      "options": {
        "baseURL": "http://localhost:8000/v1"
      },
      "models": {
        "model-name": {
          "name": "model"
        }
      }
    }
  }
}
Qwen 3.6 27B – a higher‑performance alternative
While GPT OSS 20B is an excellent starting point, practical tests show that the Qwen 3.6 27B model performs noticeably better when directly integrated with the OpenWork ecosystem. It offers higher answer precision and better context understanding during code generation and text analysis.
Technical details, documentation, and model files can be found directly on its Hugging Face page: Qwen/Qwen3.6-27B.
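Switching models means updating the client configuration as well. The sketch below generates a provider entry with the same structure as the example file shown earlier. It rests on one assumption that the article does not confirm: that the key under "models" should be the model id the inference server reports. The helper name and its parameters are illustrative.

```python
import json


def build_provider_config(base_url: str, model_id: str, display_name: str,
                          provider_key: str = "my-api") -> dict:
    """Return a config dict mirroring the article's example file structure."""
    return {
        "$schema": "https://opencode.ai/config.json",
        "provider": {
            provider_key: {
                "npm": "@ai-sdk/openai-compatible",
                "name": display_name,
                "options": {"baseURL": base_url},
                # Assumption: the key is the model id served by the backend.
                "models": {model_id: {"name": display_name}},
            }
        },
    }


if __name__ == "__main__":
    config = build_provider_config(
        base_url="http://localhost:8000/v1",
        model_id="Qwen/Qwen3.6-27B",
        display_name="Qwen 3.6 27B",
    )
    # Print the JSON; review it before writing it to the configuration file.
    print(json.dumps(config, indent=2))
```

Printing the result rather than writing it directly keeps the existing configuration file untouched until the output has been checked.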
Opportunities and limitations of Open Source solutions
Building alternatives to closed AI systems using open‑source software opens huge possibilities for organizations, but also comes with certain challenges to keep in mind:
- Advantages: Full sovereignty and data security, no subscription fees for API queries, ability to fine‑tune the model to the specifics of a given industry, no risk of sudden service shutdown by an external provider.
- Limitations: Requirement for technical knowledge in system administration and containerization, ongoing costs of maintaining GPU infrastructure in the cloud (regardless of usage intensity), need to independently manage updates and network security of the machines.
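The point about ongoing GPU costs is easy to quantify. The sketch below compares an always-on instance with one that runs only during working hours. The hourly rate is a purely hypothetical placeholder, not a real price, and storage and network charges are left out.

```python
def monthly_gpu_cost(hourly_rate: float, hours_per_day: float, days: int = 30) -> float:
    """Return the compute cost for one month at a flat hourly rate."""
    if not 0 <= hours_per_day <= 24:
        raise ValueError("hours_per_day must be between 0 and 24")
    return hourly_rate * hours_per_day * days


if __name__ == "__main__":
    rate = 1.00  # hypothetical price per GPU instance hour
    always_on = monthly_gpu_cost(rate, hours_per_day=24)
    office_hours = monthly_gpu_cost(rate, hours_per_day=8, days=22)
    print(f"Always on:    {always_on:.2f} per month")
    print(f"Office hours: {office_hours:.2f} per month")
```

Because an idle instance is billed like a busy one, stopping the virtual machine outside working hours is the most direct way to reduce this cost.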
If your organization needs professional assistance in designing and deploying sovereign AI infrastructure, the experts at the Dagploy platform offer comprehensive support. You can find more information about the platform at dagploy.com, and contact the engineering team directly at dagploy.com/contact.