In its official launch announcement, DeepSeek described the release as the beginning of “the era of cost-effective 1M context” and released preview versions of two new models. V4 Flash is the smaller, faster option, with 284 billion total parameters. V4 Pro is the flagship, with 1.6 trillion parameters, which DeepSeek says delivers “performance rivaling the world’s top closed-source models.” Both support a 1 million-token context window: enough to take in an entire codebase or a long legal document in a single session.
The pricing is where things get interesting. V4 Pro costs $3.48 per million output tokens; Anthropic charges $30 for an equivalent workload and OpenAI charges $25, while V4 Flash sits at $0.28. By Monday, DeepSeek had cut V4 Pro prices by a further 75% and reduced input cache-hit pricing to a tenth of its previous rate.
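To see what those per-million-token prices mean at scale, here is a minimal back-of-envelope sketch. The prices are the ones quoted above; the 500M-token monthly workload is a purely illustrative assumption, not a figure from DeepSeek or the providers.

```python
# Compare output-token spend across providers at a hypothetical monthly volume,
# using the per-million-token output prices quoted in the article (USD).
PRICES_PER_M_OUTPUT = {
    "DeepSeek V4 Pro": 3.48,
    "DeepSeek V4 Flash": 0.28,
    "Anthropic (equivalent workload)": 30.00,
    "OpenAI (equivalent workload)": 25.00,
}

def monthly_cost(output_tokens: int, price_per_million: float) -> float:
    """Cost in USD for a given number of output tokens at a per-1M price."""
    return output_tokens / 1_000_000 * price_per_million

# Assumed fleet output: 500M tokens per month (illustrative only).
tokens = 500_000_000
for model, price in PRICES_PER_M_OUTPUT.items():
    print(f"{model}: ${monthly_cost(tokens, price):,.2f}")
```

At that assumed volume, the same workload runs to $1,740 on V4 Pro versus $15,000 on the $30 tier, which is the gap the rest of the piece turns on.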
On benchmarks, DeepSeek’s technical report on Hugging Face puts V4 roughly three to six months behind GPT-5.4 and Gemini 3.1 Pro on knowledge tests. On coding tasks, it claims performance on par with GPT-5.4. The 2026 Stanford AI Index describes Chinese AI labs as having effectively closed the performance gap with US frontier models overall. “DeepSeek’s V4 preview is a serious flex,” Neil Shah, vice president of research at Counterpoint Research, told CNBC.
Why the architecture matters for automation teams
Two design decisions in V4 are worth understanding if you run AI-powered workflows. The first is what DeepSeek calls its Hybrid Attention Architecture. Most large models lose accuracy during very long sessions: context degrades, answers drift. V4 uses token-wise compression combined with DeepSeek Sparse Attention to address this. According to DeepSeek’s Hugging Face model card, V4 Pro needs only 27% of the inference compute and 10% of the memory of its predecessor at 1 million tokens. For agents running multi-step workflows, that is a direct cut in cost per task.
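Taking the model-card ratios at face value, the per-task saving can be sketched as follows. Only the 27% compute and 10% memory figures come from DeepSeek; the split of task cost between compute and memory is an assumption made up for illustration.

```python
# Illustrative projection only: apply DeepSeek's claimed ratios (27% of the
# predecessor's inference compute, 10% of its memory at 1M tokens) to a
# hypothetical baseline cost per agent task.
COMPUTE_RATIO = 0.27  # from DeepSeek's model card
MEMORY_RATIO = 0.10   # from DeepSeek's model card

def projected_task_cost(baseline_cost: float,
                        compute_share: float = 0.7,
                        memory_share: float = 0.3) -> float:
    """Project a task's cost assuming it splits into compute and memory
    components. The 70/30 split is an assumption, not a DeepSeek figure."""
    return baseline_cost * (compute_share * COMPUTE_RATIO
                            + memory_share * MEMORY_RATIO)

# A $1.00 baseline task projects to roughly $0.22 under these assumptions.
print(f"${projected_task_cost(1.00):.2f}")
```

The exact number depends entirely on the assumed split, but under any weighting of those two ratios the projected cost lands well below the baseline, which is the point the article is making about multi-step agents.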
The second is a deliberate focus on agentic capability. DeepSeek’s announcement states that V4 Pro is “open-source SOTA in agentic coding benchmarks” and notes that V4 has been integrated with several major AI agent tools, including Anthropic’s Claude Code. DeepSeek also says it has deployed V4 Pro internally as its own coding agent. Counterpoint’s Wei Sun described its benchmark profile as suggesting “excellent agent capability at significantly lower cost.”
This connects to a question we have covered at length: whether AI copilots actually remove work from teams, or just add another review step. The answer usually comes down to whether a model reasons across a full workflow or only responds to individual prompts. V4 is built for the former. As we reported on 2026 automation trends, enterprise buyers want to know whether AI solutions cut admin and improve execution. A capable model at open-source prices changes what is worth automating: tasks once too expensive to run continuously through an AI agent start to look viable. Model cost is one of the main reasons agentic deployments stall before they scale.
The open-source case, and what to watch for
The MIT licence means any organisation can download the weights, run the model on its own servers, and pay nothing per query. For regulated industries with hard data residency requirements, a 1 million-token model that processes contracts, patient records, or case files on-premises is a different proposition from anything available at this price point six months ago. IT teams reviewing their AI readiness across the UC stack should factor that in now, even if deployment is months away.
The limitations are real. V4 handles text only: no audio, images, or video. Running 1.6 trillion parameters on-premises demands serious hardware. DeepSeek trained V4 in partnership with Huawei, which confirmed in a statement that its Ascend 950-based Supernode clusters supported the work. That means V4 runs on Chinese domestic chips, not Nvidia, which is relevant for organisations with US export compliance rules or geopolitical procurement policies. The US government has accused Chinese AI labs of large-scale IP distillation from American models. Some buying teams will need to work through that before they can proceed.
Rishav Ganguli, founder of New Dawn AI, spoke to The National about the wider shift this week:
“For the last two years, a lot of strategy has been built on the assumption that a small number of US labs would sit at the top of a steep capability curve, and everyone else would pay to rent from them. That assumption is being repriced in real time.”
What productivity teams should do now
Gartner forecasts that 40% of enterprise applications will include task-specific AI agents by the end of 2026, up from under 5% in 2025. Every one runs on a model. When that model’s cost drops sharply and its weights are freely available, the business case for automation gets easier to write.
V4 is not a finished product. It is a preview, text-only, and trails the top closed-source models on some tasks. DeepSeek has not given a timeline for a final release. But for teams looking to test agentic automation without committing to a proprietary API contract, the combination of capability, context length, and cost is hard to ignore. Run a parallel workflow test against your current setup and see whether the cost difference holds at your volume.
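That parallel test can be as simple as logging token usage from a pilot run and projecting spend at real volume. A minimal sketch, in which every figure (task count, average output tokens, and the $25 incumbent price) is a placeholder to be replaced with your own measurements:

```python
# Sketch of the suggested parallel cost test: run the same workflow through
# your current provider and a V4 endpoint, record token usage, and project
# monthly spend. All numbers below are placeholders, not measurements.
from dataclasses import dataclass

@dataclass
class RunStats:
    tasks: int                 # tasks per month in your pilot
    avg_output_tokens: int     # measured average output tokens per task
    price_per_m_output: float  # USD per 1M output tokens, from the price sheet

    def monthly_spend(self) -> float:
        total_tokens = self.tasks * self.avg_output_tokens
        return total_tokens / 1_000_000 * self.price_per_m_output

current = RunStats(tasks=20_000, avg_output_tokens=2_500, price_per_m_output=25.00)
candidate = RunStats(tasks=20_000, avg_output_tokens=2_500, price_per_m_output=3.48)

print(f"current:   ${current.monthly_spend():,.2f}")
print(f"candidate: ${candidate.monthly_spend():,.2f}")
print(f"savings:   ${current.monthly_spend() - candidate.monthly_spend():,.2f}")
```

The useful part of the exercise is not the headline savings number but whether the measured token counts per task actually match between the two models; if the cheaper model needs more turns to finish, the gap narrows.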