DeepSeek: The open-source challenger and what it means for business

Christan Doornhof

Strategy Lead

DeepSeek, a Chinese AI startup, has made waves with the launch of models like DeepSeek-R1, which rival industry giants like OpenAI in performance while reportedly being developed at a fraction of the cost. This disruptive achievement has sent shockwaves through the AI landscape, raising questions about the return on investment (ROI) for closed-source models. With training costs for DeepSeek-R1 reported at just $6 million, business leaders are now reevaluating what this means for their organization’s approach to AI and how it might reshape their budget and strategy moving forward. As open-source AI becomes a more viable alternative, companies must reassess their AI strategies and ask:

  • If DeepSeek achieves comparable performance at 3–5% of the cost of OpenAI’s models, how does this change our AI budget allocation?
  • Should we prioritize open-source models like DeepSeek-R1 for flexibility, or stick with proprietary systems for perceived reliability?
  • Does adopting DeepSeek require overhauling our existing AI infrastructure?

How DeepSeek works in a nutshell

The DeepSeek team has made genuine progress in optimizing both training and computational costs, in part through reinforcement learning. DeepSeek used reinforcement learning (RL) to refine its model by teaching it to generate responses aligned with human preferences and real-world needs. It achieved this by implementing a reward system: for objective tasks like coding or math, rewards were given based on automated checks (e.g., running code tests), while for subjective tasks like creative writing, a reward model evaluated how well the output matched desired qualities like clarity and relevance.

The model repeatedly generated multiple outputs for the same input, learning to identify and prioritize better responses. This iterative process improved the model’s accuracy, reliability, and user alignment, making it more effective for practical applications and reducing the need for manual corrections.
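
To make the idea concrete, here is a minimal sketch of that loop, not DeepSeek's actual training code: for each prompt the model produces several candidate answers, each candidate is scored by a reward function (an automated check for objective tasks, a reward-model score for subjective ones), and the higher-scoring responses are the ones a real RL update would reinforce. All function names and the toy scoring rules below are illustrative placeholders.

```python
import random

def run_code_tests(response: str) -> bool:
    """Stand-in for an automated check, e.g. executing unit tests on generated code."""
    return "return" in response  # toy criterion

def reward_model_score(prompt: str, response: str) -> float:
    """Stand-in for a learned reward model scoring clarity and relevance."""
    return min(len(response) / 100, 1.0)  # toy criterion

def reward(prompt: str, response: str, task_type: str) -> float:
    # Objective tasks (code, math): reward comes from automated verification.
    if task_type == "code":
        return 1.0 if run_code_tests(response) else 0.0
    # Subjective tasks (e.g. creative writing): reward comes from a reward model.
    return reward_model_score(prompt, response)

def rl_step(generate, prompt: str, task_type: str, n: int = 4):
    """Sample several candidates for one prompt and rank them by reward.

    A real RL update would then reinforce the higher-scoring responses;
    here we only return the ranking to illustrate the loop.
    """
    candidates = [generate(prompt) for _ in range(n)]
    return sorted(((reward(prompt, r, task_type), r) for r in candidates), reverse=True)

# Toy usage: a "model" that returns random canned answers.
toy_model = lambda prompt: random.choice(
    ["def add(a, b): return a + b", "print('hello')", "no code here"])
print(rl_step(toy_model, "Write an add function", "code"))
```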

How DeepSeek is redefining AI economics

DeepSeek seems to have accomplished two significant feats:

  1. Its Mixture-of-Experts (MoE) architecture activates only 37 billion of its 671 billion parameters per token, reducing computational inference costs by roughly 90% (a simplified sketch of this routing follows this list).
  2. The DeepSeek team states that only $6 million was incurred in training the model.
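
To illustrate how an MoE layer keeps most parameters idle, here is a simplified routing sketch; it is not DeepSeek's architecture, and the expert count, layer sizes, and top-k value are illustrative assumptions. A small router scores the experts for each token, only the top-scoring experts run, and the rest of the layer's parameters are never touched for that token.

```python
import numpy as np

rng = np.random.default_rng(0)

N_EXPERTS, TOP_K = 8, 2          # illustrative: route each token to 2 of 8 experts
D_MODEL, D_HIDDEN = 64, 256      # illustrative layer sizes

# Each expert is a small feed-forward block; only the selected experts run per token.
experts = [(rng.standard_normal((D_MODEL, D_HIDDEN)) * 0.02,
            rng.standard_normal((D_HIDDEN, D_MODEL)) * 0.02)
           for _ in range(N_EXPERTS)]
router = rng.standard_normal((D_MODEL, N_EXPERTS)) * 0.02

def moe_layer(token: np.ndarray) -> np.ndarray:
    scores = token @ router                           # router score per expert
    gates = np.exp(scores) / np.exp(scores).sum()     # softmax gate weights
    top = np.argsort(gates)[-TOP_K:]                  # keep only the top-k experts
    out = np.zeros_like(token)
    for i in top:                                     # only these experts do any compute
        w_in, w_out = experts[i]
        out += gates[i] * (np.maximum(token @ w_in, 0) @ w_out)
    return out

token = rng.standard_normal(D_MODEL)
print(moe_layer(token).shape)  # (64,): full output, but only 2 of 8 experts were used
```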

Let's dive a bit deeper into this to uncover the implications.

First of all, the $6 million figure quoted by much of the media does not cover the total cost of developing the model; it refers only to the actual training costs incurred. DeepSeek clarifies this in its paper:

  • Each trillion tokens took 180,000 GPU hours, or 3.7 days, using a cluster of 2,048 H800 GPUs.
  • The entire pre-training stage was completed in under two months, requiring 2.664 million GPU hours.
  • Adding 119,000 GPU hours for extending the model’s context capabilities and 5,000 GPU hours for final fine-tuning, the total training used 2.788 million GPU hours.
  • Assuming a rental cost of $2 per GPU hour, this brought the total training cost to $5.576 million, as checked in the short calculation below. This figure excludes expenses for earlier research and experiments.
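
Here is that arithmetic, reproduced from the figures above (the $2 rental price is DeepSeek's stated assumption):

```python
# Reproducing the training-cost arithmetic reported above.
gpu_hours_pretraining = 2_664_000   # pre-training stage
gpu_hours_context_ext = 119_000     # context-length extension
gpu_hours_fine_tuning = 5_000       # final fine-tuning
rental_cost_per_gpu_hour = 2.0      # assumed H800 rental price in USD

total_gpu_hours = gpu_hours_pretraining + gpu_hours_context_ext + gpu_hours_fine_tuning
print(total_gpu_hours)                             # 2,788,000 GPU hours
print(total_gpu_hours * rental_cost_per_gpu_hour)  # 5,576,000 -> $5.576 million
```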

Second, while the stated training cost for DeepSeek-R1 is impressive, it isn't as directly relevant to most organizations as media outlets make it out to be. This is because organizations usually do not train AI models from scratch. Instead, most businesses deploy pre-trained models tailored to their specific use cases. As a result, DeepSeek's accomplishments could still significantly influence the cost of using generative AI in several ways:

  • DeepSeek's lower training costs translate to more affordable API pricing for organizations that decide to opt for DeepSeek. More concretely, DeepSeek's R1 model is priced at $2.19 per million output tokens, while OpenAI's o1 is $60 per million output tokens, making OpenAI's model approximately 27 times more expensive (a quick cost comparison is sketched after this list). However, businesses must weigh the implications of hosting their data with a Chinese provider, something many U.S. and European companies will want to avoid due to data privacy concerns.
  • Since DeepSeek is open-source, cloud infrastructure providers are free to deploy the model on their platforms and offer it as an API service. Thanks to DeepSeek's Mixture-of-Experts (MoE) architecture, which activates only a fraction of the model's parameters per token, this could create a cost-effective alternative to proprietary APIs like OpenAI's, with performance that rivals their best models.
  • While DeepSeek’s $6 million figure lacks transparency around total associated costs (e.g., R&D and experimentation), it demonstrates that high-performance AI can be developed at significantly lower costs. This could inspire competitors to follow suit, increasing competition and driving down costs across the industry. For organizations focused on the application layer of AI, this means more options.
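
As a rough illustration of what that price gap means in practice, here is a back-of-the-envelope comparison using the per-token prices quoted above; the monthly token volume is an assumed example workload, not a benchmark.

```python
# Prices quoted above, in USD per million output tokens.
deepseek_r1_price = 2.19
openai_o1_price = 60.00

monthly_output_tokens = 500_000_000  # assumed example: 500M output tokens per month

deepseek_cost = monthly_output_tokens / 1_000_000 * deepseek_r1_price
openai_cost = monthly_output_tokens / 1_000_000 * openai_o1_price

print(f"DeepSeek-R1: ${deepseek_cost:,.0f} per month")             # ~$1,095
print(f"OpenAI o1:   ${openai_cost:,.0f} per month")               # ~$30,000
print(f"Price ratio: {openai_o1_price / deepseek_r1_price:.1f}x")  # ~27.4x
```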

Choosing the correct pre-trained model

As mentioned earlier, most companies looking to use large language models (LLMs) rely on pre-trained models rather than training their own from scratch. However, one detail often overlooked by business leaders is that while DeepSeek-R1, the company’s best-performing model, is open-source and accessible, it comes with significant hardware requirements. The model itself is over 700 GB, meaning it requires a high-performance setup with advanced GPUs—an investment that can easily exceed $100,000.

For organizations considering the open-source route with DeepSeek, it’s critical to carefully evaluate which version of the R1 model aligns with their needs and capabilities. DeepSeek offers models of varying sizes (measured in parameters), much like car engines:

DeepSeek's model family, compared on cost, best use cases, infrastructure requirements, and potential risks:

  • Full R1 (671 billion parameters): built for top-tier, high-end reasoning, but demands expensive infrastructure with high-end GPUs and expert management.
  • Distilled model (70 billion parameters): balances performance and cost, making it well suited to specialized AI tasks such as coding assistants and automation.
  • Tiny model (1.5 billion parameters): the most affordable option for lightweight AI applications such as customer support and document classification, though it may require fine-tuning for best performance.

Hosting DeepSeek-R1 in its full capacity (671B parameters) will likely require substantial infrastructure investments, such as high-end GPU clusters costing upwards of $100,000. However, smaller variants of the model can be deployed with less demanding hardware, making them more accessible to mid-sized organizations.
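
To get a feel for why the hardware bill is so high, here is a rough memory estimate for the weights alone; the 8-bit and 16-bit precisions are common deployment choices rather than figures from DeepSeek, and real deployments also need memory for activations and the KV cache.

```python
def weight_memory_gb(params_billions: float, bytes_per_param: float) -> float:
    """Approximate size of the model weights alone, ignoring activations and KV cache."""
    return params_billions * 1e9 * bytes_per_param / 1e9

for name, params in [("Full R1", 671), ("Distilled 70B", 70), ("Tiny 1.5B", 1.5)]:
    gb_8bit = weight_memory_gb(params, 1)   # 8-bit weights
    gb_16bit = weight_memory_gb(params, 2)  # 16-bit weights
    print(f"{name}: ~{gb_8bit:,.0f} GB at 8-bit, ~{gb_16bit:,.0f} GB at 16-bit")
# Full R1 comes out around 671 GB at 8-bit, consistent with the 700+ GB figure above.
```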

Alternatively, businesses can explore cloud-hosted options to avoid upfront infrastructure costs altogether. Whether an overhaul is necessary depends on the organization’s current capabilities, but experimenting with smaller models or hybrid setups could allow businesses to integrate DeepSeek without disrupting existing workflows.

Why choose open-source as part of your AI strategy

The choice between open-source and closed-source AI models presents a nuanced decision for business leaders, each path offering distinct advantages and challenges.

Open-source AI essentially gives you the most control over the technology. Because you deploy the model on your own infrastructure, your data never has to leave for a third-party provider. In addition, development of open-source projects is typically strong when the surrounding community is active. The trade-off is that your organization needs to support it: you need the technical skills to manage and adapt the models effectively and to safeguard performance.

Closed-source AI models provide ready-to-deploy solutions with dedicated support, ensuring reliability and ease of use. These proprietary systems often incorporate cutting-edge features developed through substantial R&D investments. Yet, they may come with higher costs and less flexibility for customization.

Ultimately, the decision hinges on an organization’s strategic priorities, resources, and risk tolerance. For instance, a company prioritizing rapid deployment and support might lean towards closed-source solutions, while one seeking tailored functionalities and cost efficiency could find open-source models more appealing.

How we view open-source

At Datalumina, we prioritize open-source technologies because they reflect both our practical needs and the way we operate as a company. Being a small team with deep technical expertise, we have the capability to manage and monitor the systems we use without relying on restrictive proprietary solutions. Open-source provides the flexibility and transparency we need to adapt and scale without unnecessary overhead.

Additionally, our focus on being part of a collaborative community naturally aligns with open-source principles. It allows us to work within a broader ecosystem of shared tools and knowledge, rather than building in isolation.

The GenAI Launchpad illustrates our commitment to open-source. We rely heavily on technologies such as FastAPI, PostgreSQL, Redis, and Docker because these tools are tried and tested and serve our community best. They give our customers enough flexibility to support PoCs as well as production-ready workloads. That being said, the choice of LLM is largely use-case dependent. As such, any AI-based technical setup needs to remain flexible enough to experiment with both open-source and closed-source LLMs. Key in this process is building robust evaluation frameworks that help you accurately estimate the performance of the various LLMs you use.
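
As a minimal sketch of what such an evaluation loop can look like (the models here are placeholder functions standing in for calls to open- or closed-source LLMs, and the exact-match scoring is deliberately simplistic):

```python
from typing import Callable, Dict, List

# Each "model" is just a callable from prompt to response. In practice these would
# wrap API clients or locally hosted models (DeepSeek, OpenAI, and so on).
def placeholder_model_a(prompt: str) -> str:
    return "Paris" if "capital of France" in prompt else "I don't know"

def placeholder_model_b(prompt: str) -> str:
    return "Paris" if "France" in prompt else "Unsure"

# A tiny evaluation set: prompts paired with expected answers.
EVAL_SET: List[Dict[str, str]] = [
    {"prompt": "What is the capital of France?", "expected": "Paris"},
    {"prompt": "Name the capital city of France.", "expected": "Paris"},
]

def evaluate(model: Callable[[str], str], eval_set: List[Dict[str, str]]) -> float:
    """Score a model on the eval set; here, simple exact-match accuracy."""
    correct = sum(model(ex["prompt"]).strip() == ex["expected"] for ex in eval_set)
    return correct / len(eval_set)

models = {"model_a": placeholder_model_a, "model_b": placeholder_model_b}
for name, model in models.items():
    print(f"{name}: accuracy = {evaluate(model, EVAL_SET):.0%}")
```

In practice, the placeholders would be replaced with real model clients, and the exact-match check with task-specific metrics or an LLM-as-judge scorer, but the structure, a shared evaluation set with interchangeable models, is what keeps the setup flexible.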

Conclusion: The future of AI is more competitive, accessible, and flexible

DeepSeek’s emergence as a high-performing, cost-effective open-source LLM represents a major shift in the AI landscape. By demonstrating that state-of-the-art AI can be developed at a fraction of the cost, DeepSeek has lowered the barriers to high-performance AI adoption. This increased accessibility is set to dramatically intensify competition among LLM providers, as more players—especially cloud infrastructure providers—build upon DeepSeek’s open-source foundation to offer cost-efficient AI services. The days of a few dominant AI providers monopolizing the space are quickly fading, giving businesses more choices than ever before.

For business leaders, this wave of competition presents both opportunities and challenges. The availability of open-source alternatives means that AI deployment no longer requires reliance on expensive, proprietary models. However, choosing the right approach—open-source vs. closed-source, self-hosted vs. cloud-based—depends on factors like budget, data privacy needs, and internal technical expertise. Those seeking maximum control and cost efficiency may lean toward open-source models, while those prioritizing ease of deployment and support may still opt for closed-source APIs. Regardless of the choice, one thing is clear: businesses can no longer afford to ignore the impact of open-source AI.

To navigate this evolving AI landscape successfully, organizations must prioritize flexibility in their AI strategy. Rather than committing to a single model or provider, building a technical setup that allows experimentation with multiple models, both open- and closed-source, is crucial. This ensures that companies can evaluate performance, costs, and trade-offs in real time, adapting to new developments without being locked into a single provider.
