Blog

  • The unexpected benefits of AI PCs: why creativity could be the new productivity

    Presented by HP


    Creativity is quickly becoming the new measure of productivity. While AI is often framed as a tool for efficiency and automation, new research from MIT Sloan School of Management shows that generative AI enhances human creativity — when employees have the right tools and skills to use it effectively.

    That’s where AI PCs come in. These next-generation laptops combine local AI processing with powerful Neural Processing Units (NPUs), delivering the speed and security that knowledge workers expect while also unlocking new creative possibilities. By handling AI tasks directly on the device, AI PCs minimize latency, protect sensitive data, and lower energy consumption.

    Teams are already proving the impact. Marketing teams are using AI PCs to generate campaign assets in hours instead of weeks. Engineers are shortening design and prototyping cycles. Sales reps are creating personalized proposals onsite, even without cloud access. In each case, AI PCs are not just accelerating workflows — they’re sparking fresh ideas, faster iteration, and more engaged teams.

    The payoff is clear: creativity that translates into measurable business outcomes, from faster time-to-market and stronger compliance to deeper customer engagement. Still, adoption is uneven, and the benefits aren’t yet reaching the wider workforce.

    Early creative benefits, but a divide remains

    New Morning Consult and HP research shows nearly half of IT decision makers (45%) already use AI PCs for creative assistance, with almost a third (29%) using them for tasks like image generation and editing. That’s not just about efficiency — it’s about bringing imagination into everyday workflows.

    According to HP’s 2025 Work Relationship Index, fulfillment is the single biggest driver of a healthy work relationship, outranking even leadership. Give employees tools that let them create, not just execute tasks, and you unlock productivity, satisfaction, retention, and optimism. The same instinct that drives workers to build outside the office is the one companies can harness inside it.

    The challenge is that among broader knowledge workers, adoption is still low: just 29% for creative assistance and 19% for image generation. This creative divide means the full potential of AI PCs hasn’t reached the wider workforce. For CIOs, the opportunity isn’t just deploying faster machines — it’s fostering a workplace culture where creativity drives measurable business value.

    Creative benefits of AI PCs

    So when you put AI PCs in front of the employees who embrace the possibilities, what does that look like in practice? Early adopters are already seeing AI PCs reshape how creative work gets done.

    Teams dream up fresh ideas, faster. AI PCs can spark new perspectives and out-of-the-box solutions, enhancing human creativity rather than replacing it. With dedicated NPUs handling AI workloads, employees stay in flow without interruptions. Battery life is extended, latency drops, and performance improves — allowing teams to focus on ideas, not wait times.

    On-device AI is opening new creative mediums, from visual design to video production to music editing; videos, photos, and presentations can be generated, edited, and refined in real time.

    Plus, AI workloads like summarization, transcription, and code generation run instantly without relying on cloud APIs. That means employees can work productively in low-bandwidth or disconnected environments, removing downtime risks, especially for mobile workforces and global deployments.

    And across the organization, AI PCs mean real-world, measurable business outcomes.

    Marketing: AI PCs enable creative teams to generate ad variations, social content, and campaign assets in minutes instead of days, reducing dependence on external agencies. And that leads to faster campaign launches, reduced external vendor spend, and increased pipeline velocity.

    Product and engineering: Designers/engineers can prototype in CAD, generate 3D mockups, or run simulations locally with on-device AI accelerators, shortening feedback loops. That means reduced iteration cycles, faster prototyping, and faster time-to-market.

    Sales/customer engagement: Reps can use AI PCs to generate real-time proposals, personalized presentations, or analyze contracts offline at client sites, even without cloud connection. This generates faster deal cycles, higher client engagement, and a shorter sales turnaround.

    From efficiency to fulfillment

    AI PCs are more than just a performance upgrade. They’re reshaping how people approach and experience work. By giving employees tools that spark creativity as well as productivity, organizations can unlock faster innovation, deeper engagement, and stronger retention.

    For CIOs, the opportunity goes beyond efficiency gains. The true value of AI PCs won’t be measured in speed or specs, but in how they open new possibilities for creation, collaboration, and competition — helping teams not just work faster, but work more creatively and productively.


    Sponsored articles are content produced by a company that is either paying for the post or has a business relationship with VentureBeat, and they’re always clearly marked. For more information, contact sales@venturebeat.com.

  • AI’s financial blind spot: Why long-term success depends on cost transparency

    Presented by Apptio, an IBM company


    When a technology with revolutionary potential comes on the scene, it’s easy for companies to let enthusiasm outpace fiscal discipline. Bean counting can seem short-sighted in the face of exciting opportunities for business transformation and competitive dominance. But money is always an object. And when the tech is AI, those beans can add up fast.

    AI’s value is becoming evident in areas like operational efficiency, worker productivity, and customer satisfaction. However, this comes at a cost. The key to long-term success is understanding the relationship between the two — so you can ensure that the potential of AI translates into real, positive impact for your business.

    The AI acceleration paradox

    While AI is helping to transform business operations, its own financial footprint often remains obscure. If you can’t connect costs to impact, how can you be sure your AI investments will drive meaningful ROI? This uncertainty makes it no surprise that in the 2025 Gartner® Hype Cycle™ for Artificial Intelligence, GenAI has moved into the “Trough of Disillusionment.”

    Effective strategic planning depends on clarity. In its absence, decision-making falls back on guesswork and gut instinct. And there’s a lot riding on these decisions. According to Apptio research, 68% of technology leaders surveyed expect to increase their AI budgets, and 39% believe AI will be their departments’ biggest driver of future budget growth.

    But bigger budgets don’t guarantee better outcomes. Gartner® also reveals that “despite an average spend of $1.9 million on GenAI initiatives in 2024, fewer than 30% of AI leaders say their CEOs are satisfied with the return on investment.” If there’s no clear link between cost and outcome, organizations risk scaling investments without scaling the value they’re meant to create.

    To move forward with well-founded confidence, business leaders in finance, IT, and tech must collaborate to gain visibility into AI’s financial blind spot.

    The hidden financial risks of AI

    The runaway costs of AI can give IT leaders flashbacks to the early days of public cloud. When it’s easy for DevOps teams and business units to procure their own resources on an OpEx basis, costs and inefficiencies can quickly spiral. In fact, AI projects are avid consumers of cloud infrastructure — while incurring additional costs for data platforms and engineering resources. And that’s on top of the tokens used for each query. The decentralized nature of these costs makes them particularly difficult to attribute to business outcomes.

    As with the cloud, the ease of AI procurement quickly leads to AI sprawl. And finite budgets mean that every dollar spent represents an unconscious tradeoff with other needs. People worry that AI will take their job. But it’s just as likely that AI will take their department’s budget.

    Meanwhile, according to Gartner®, “Over 40% of agentic AI projects will be canceled by end of 2027, due to escalating costs, unclear business value or inadequate risk controls.” But are those the right projects to cancel? Lacking a way to connect investment to impact, how can business leaders know whether those rising costs are justified by proportionally greater ROI?

    Without transparency into AI costs, companies risk overspending, under-delivering, and missing out on better opportunities to drive value.

    Why traditional financial planning can't handle AI

    As the cloud era taught us, traditional static budget models are poorly suited for dynamic workloads and rapidly scaling resources. The key to cloud cost management has been tagging and telemetry, which help companies attribute each dollar of cloud spend to specific business outcomes. AI cost management will require similar practices. But the scope of the challenge goes much further. On top of costs for storage, compute, and data transfer, each AI project brings its own set of requirements — from prompt optimization and model routing to data preparation, regulatory compliance, security, and personnel.

    This complex mix of ever-shifting factors makes it understandable that finance and business teams lack granular visibility into AI-related spend — and IT teams struggle to reconcile usage with business outcomes. But it’s impossible to precisely and accurately track ROI without these connections.

    The strategic value of cost transparency

    Cost transparency empowers smarter decisions — from resource allocation to talent deployment.

    Connecting specific AI resources with the projects that they support helps technology decision-makers ensure that the most high-value projects are given what they need to succeed. Setting the right priorities is especially critical when top talent is in short supply. If your highly compensated engineers and data scientists are spread across too many interesting but unessential pilots, it’ll be hard to staff the next strategic — and perhaps pressing — pivot.

    FinOps best practices apply equally to AI. Cost insights can surface opportunities to optimize infrastructure and address waste, whether by right-sizing performance and latency to match workload requirements, or by selecting a smaller, more cost-effective model instead of defaulting to the latest large language model (LLM). As work proceeds, tracking can flag rising costs so leaders can pivot quickly in more-promising directions as needed. A project that makes sense at X cost might not be worthwhile at 2X cost.

    Companies that adopt a structured, transparent, and well-governed approach to AI costs are more likely to spend the right money in the right ways and see optimal ROI from their investment.

    TBM: An enterprise framework for AI cost management

    Transparency and control over AI costs depend on three practices:

    IT financial management (ITFM): Managing IT costs and investments in alignment with business priorities

    FinOps: Optimizing cloud costs and ROI through financial accountability and operational efficiency

    Strategic portfolio management (SPM): Prioritizing and managing projects to better ensure they deliver maximum value for the business

    Collectively, these three disciplines make up Technology Business Management (TBM) — a structured framework that helps technology, business, and finance leaders connect technology investments to business outcomes for better financial transparency and decision-making.

    Most companies are already on the road to TBM, whether they realize it or not. They may have adopted some form of FinOps or cloud cost management. Or they might be developing strong financial expertise for IT. Or they may rely on Enterprise Agile Planning or Strategic Portfolio Management practices to deliver initiatives more successfully. AI can draw on — and impact — all of these areas. By unifying them under one umbrella with a common model and vocabulary, TBM brings essential clarity to AI costs and the business impact they enable.

    AI success depends on value — not just velocity. The cost transparency that TBM provides offers a road map that can help business and IT leaders make the right investments, deliver them cost-effectively, scale them responsibly, and turn AI from a costly mistake into a measurable business asset and strategic driver.

    Sources: Gartner Press Release, Gartner Predicts Over 40% of Agentic AI Projects Will Be Canceled by End of 2027, June 25, 2025 https://www.gartner.com/en/newsroom/press-releases/2025-06-25-gartner-predicts-over-40-percent-of-agentic-ai-projects-will-be-canceled-by-end-of-2027

    GARTNER is a registered trademark and service mark of Gartner, Inc. and/or its affiliates in the U.S. and internationally and is used herein with permission. All rights reserved.


    Ajay Patel is General Manager, Apptio and IT Automation at IBM.


    Sponsored articles are content produced by a company that is either paying for the post or has a business relationship with VentureBeat, and they’re always clearly marked. For more information, contact sales@venturebeat.com.

  • Claude Code comes to web and mobile, letting devs launch parallel jobs on Anthropic’s managed infra

    Vibe coding is evolving, and with it the leading AI-powered coding services and tools, including Anthropic’s Claude Code.

    As of today, the service will be available via the web and, in preview, on the Claude iOS app, giving developers access to additional asynchronous capabilities. Previously, it was available through the terminal on developers' PCs with support for Git, Docker, Kubernetes, npm, pip, the AWS CLI and more, and as an extension for Microsoft's open source VS Code editor and for JetBrains integrated development environments (IDEs) via Claude Agent.

    “Claude Code on the web lets you kick off coding sessions without opening your terminal,” Anthropic said in a blog post. “Connect your GitHub repositories, describe what you need, and Claude handles the implementation. Each session runs in its own isolated environment with real-time progress tracking, and you can actively steer Claude to adjust course as it’s working through tasks.”

    This allows users to run coding projects asynchronously, a capability that many enterprises are looking for.

    The web version of Claude Code, currently in research preview, will be available to Pro and Max users. However, web Claude Code will be subject to the same rate limits as other versions. Anthropic tightened rate limits on Claude and Claude Code after the coding tool's unexpected popularity in July, when some users ran Claude Code continuously overnight.

    Anthropic is now ensuring Claude Code comes closer to matching the availability of rival OpenAI's Codex AI coding platform, powered by a variant of GPT-5, which launched on mobile and the web in mid-September 2025.

    Parallel usage

    Anthropic said running Claude Code in the cloud means teams can “now run multiple tasks in parallel across different repositories from a single interface and ship faster with automatic PR creation and clear change summaries.”

    One of the big draws of coding agents is giving developers the ability to run multiple coding projects, such as bugfixes, at the same time. Google’s two coding agents, Jules and Code Assist, both offer asynchronous code generation and checks. Codex from OpenAI also lets people work in parallel.

    Anthropic said bringing Claude Code to the web won’t disrupt workflows, but noted that cloud-run tasks work best for answering questions about projects and how repositories are mapped, bugfixes, routine and well-defined tasks, and backend changes whose adjustments can be verified.

    While most developers will likely prefer to use Claude Code on a desktop, Anthropic said the mobile version could encourage more users to “explore coding with Claude on the go.”

    Isolated environments 

    Anthropic insisted that Claude Code tasks on the cloud will have the same level of security as the earlier version. It runs on an “isolated sandbox environment with network and filesystem restrictions.” 

    Interactions go through a secure proxy service, which the company said ensures the model only accesses authorized repositories.

    Enterprise users can customize which domains Claude Code can connect to. 

    Claude Code is powered by Claude Sonnet 4.5, which Anthropic claims is the best coding model around. The company recently made Claude Haiku 4.5, a smaller version of Claude that also has strong coding capabilities, available to all Claude subscribers, including free users. 

  • Adobe Foundry wants to rebuild Firefly for your brand — not just tweak it

    Hoping to attract more enterprise teams to its ecosystem, Adobe launched a new model customization service called Adobe AI Foundry, which creates bespoke versions of its flagship AI model, Firefly.

    Adobe AI Foundry will work with enterprise customers to rearchitect and retrain Firefly models specific to the client. Foundry models differ from custom Firefly models in that they understand multiple concepts, where custom models are limited to a single concept. Foundry models will also be multimodal, offering wider use cases than custom Firefly models, which can only ingest and respond with images.

    Adobe AI Foundry models, with Firefly at their base, will know a company’s brand tone, image and video style, products and services and all its IP. The models will generate content based on this information for any use case the company wants.

    Hannah Elsakr, vice president, GenAI New Business Ventures at Adobe, told VentureBeat that the idea to set up AI Foundry came because enterprise customers wanted more sophisticated custom versions of Firefly. But with how complex the needs of enterprises are, Adobe will be doing the rearchitecting rather than handing the reins over to customers. 

    “We will retrain our own Firefly commercially safe models with the enterprise IP. We keep that IP separate. We never take that back into the base model, and the enterprise itself owns that output,” Elsakr said. 

    Adobe will deploy the Foundry version of Firefly through its API solution, Firefly Services. 

    Elsakr likened AI Foundry to an advisory service, since Adobe will have teams working directly with enterprise customers to retrain the model. 

    Deep tuning

    Elsakr refers to Foundry as a deep tuning method because it goes further than simply fine-tuning a model.

    “The way we think about it, maybe more layman's terms, is that we're surgically reopening the Firefly-based models,” Elsakr said. “So you get the benefit of all the world's knowledge from our image model or a video model. We're going back in time and are bringing in the IP from the enterprise, like a brand. It could be footage from a shot style, whatever they have a license to contribute. We then retrain. We call this continuous pre-training, where we overweigh the model to dial some things differently. So we're literally retraining our base model, and that's why we call it deep tuning instead of fine-tuning.”

    Part of the training pipeline involves Adobe’s embedded teams working with the company to identify the data they would need. Then the data is securely transferred and ingested before being tagged. It is fed to the base model, and then Adobe begins a pre-training model run. 

    Elsakr maintains the Foundry versions of Firefly will not be small or distilled models. Often, the additional data from companies expands the parameters of Firefly.

    Two early customers of Adobe AI Foundry are Home Depot and Walt Disney Imagineering, the research and development arm of Disney for its theme parks. 

    “We are always exploring innovative ways to enhance our customer experience and streamline our creative workflows. Adobe’s AI Foundry represents an exciting step forward in embracing cutting-edge technologies to deepen customer engagement and deliver impactful content across our digital channels,” said Molly Battin, senior vice president and chief marketing officer at The Home Depot.

    More customization

    Enterprises often turn to fine-tuning and model customization to bring large language models with their vast external knowledge closer to their company’s needs. Fine-tuning also enables enterprise users to utilize models only in the context of their organization’s data, so the model doesn’t respond with text wholly unrelated to the business.

    Most organizations, however, do the fine-tuning themselves. They connect to the model’s API and begin retraining it to answer based on their ground truth or their preferences. Several methods for fine-tuning exist, including some that can be done with just a prompt. Other model providers also try to make it easier for their customers to fine-tune models, such as OpenAI with its o4-mini reasoning model.

    Elsakr said she expects some companies will have three versions of Firefly: the Foundry version for most projects, a custom Firefly for specific single-concept use cases, and the base Firefly because some teams want a model less encumbered by corporate knowledge. 

  • The teacher is the new engineer: Inside the rise of AI enablement and PromptOps

    As more companies quickly begin using gen AI, it’s important to avoid a big mistake that could undercut its effectiveness: skipping proper onboarding. Companies spend time and money training new human workers to succeed, but when they use large language model (LLM) helpers, many treat them like simple tools that need no explanation.

    This isn't just a waste of resources; it's risky. Research shows that AI advanced quickly from testing to actual use between 2024 and 2025, with almost a third of companies reporting a sharp increase in usage and acceptance over the previous year.

    Probabilistic systems need governance, not wishful thinking

    Unlike traditional software, gen AI is probabilistic and adaptive. It learns from interaction, can drift as data or usage changes and operates in the gray zone between automation and agency. Treating it like static software ignores reality: Without monitoring and updates, models degrade and produce faulty outputs, a phenomenon widely known as model drift. Gen AI also lacks built-in organizational intelligence. A model trained on internet data may write a Shakespearean sonnet, but it won’t know your escalation paths and compliance constraints unless you teach it. Regulators and standards bodies have begun pushing guidance precisely because these systems behave dynamically and can hallucinate, mislead or leak data if left unchecked.

    The real-world costs of skipping onboarding

    When LLMs hallucinate, misinterpret tone, leak sensitive information or amplify bias, the costs are tangible.

    • Misinformation and liability: A Canadian tribunal held Air Canada liable after its website chatbot gave a passenger incorrect policy information. The ruling made it clear that companies remain responsible for their AI agents’ statements.

    • Embarrassing hallucinations: In 2025, a syndicated “summer reading list” carried by the Chicago Sun-Times and Philadelphia Inquirer recommended books that didn’t exist; the writer had used AI without adequate verification, prompting retractions and firings.

    • Bias at scale: The Equal Employment Opportunity Commission's (EEOC's) first AI-discrimination settlement involved a recruiting algorithm that auto-rejected older applicants, underscoring how unmonitored systems can amplify bias and create legal risk.

    • Data leakage: After employees pasted sensitive code into ChatGPT, Samsung temporarily banned public gen AI tools on corporate devices — an avoidable misstep with better policy and training.

    The message is simple: Un-onboarded AI and un-governed usage create legal, security and reputational exposure.

    Treat AI agents like new hires

    Enterprises should onboard AI agents as deliberately as they onboard people — with job descriptions, training curricula, feedback loops and performance reviews. This is a cross-functional effort across data science, security, compliance, design, HR and the end users who will work with the system daily.

    1. Role definition. Spell out scope, inputs/outputs, escalation paths and acceptable failure modes. A legal copilot, for instance, can summarize contracts and surface risky clauses, but should avoid final legal judgments and must escalate edge cases.

    2. Contextual training. Fine-tuning has its place, but for many teams, retrieval-augmented generation (RAG) and tool adapters are safer, cheaper and more auditable. RAG keeps models grounded in your latest, vetted knowledge (docs, policies, knowledge bases), reducing hallucinations and improving traceability. Emerging Model Context Protocol (MCP) integrations make it easier to connect copilots to enterprise systems in a controlled way — bridging models with tools and data while preserving separation of concerns. Salesforce’s Einstein Trust Layer illustrates how vendors are formalizing secure grounding, masking, and audit controls for enterprise AI.

    3. Simulation before production. Don’t let your AI’s first “training” be with real customers. Build high-fidelity sandboxes and stress-test tone, reasoning and edge cases — then evaluate with human graders. Morgan Stanley built an evaluation regimen for its GPT-4 assistant, having advisors and prompt engineers grade answers and refine prompts before broad rollout. The result: >98% adoption among advisor teams once quality thresholds were met. Vendors are also moving to simulation: Salesforce recently highlighted digital-twin testing to rehearse agents safely against realistic scenarios.

    4. Cross-functional mentorship. Treat early usage as a two-way learning loop: Domain experts and front-line users give feedback on tone, correctness and usefulness; security and compliance teams enforce boundaries and red lines; designers shape frictionless UIs that encourage proper use.
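    The contextual-training step above can be sketched as a tiny retrieval loop. This is an illustrative toy, not a production RAG stack: the knowledge base, the document names, and the keyword-overlap ranking are hypothetical stand-ins for an embedding model and a vector index.

    ```python
    # Toy knowledge base standing in for vetted enterprise docs (hypothetical).
    KNOWLEDGE_BASE = {
        "escalation-policy": "Contracts above $500k must be escalated to legal review.",
        "tone-guide": "Customer replies should be concise and avoid legal judgments.",
        "pto-policy": "Employees accrue 1.5 days of PTO per month.",
    }

    def retrieve(query: str, k: int = 1) -> list[str]:
        """Rank documents by naive keyword overlap with the query
        (a real system would use embeddings and a vector index)."""
        q_terms = set(query.lower().split())
        scored = sorted(
            KNOWLEDGE_BASE.items(),
            key=lambda kv: len(q_terms & set(kv[1].lower().split())),
            reverse=True,
        )
        return [doc_id for doc_id, _ in scored[:k]]

    def grounded_prompt(query: str) -> str:
        """Assemble the prompt a copilot would receive: vetted context first,
        then the question, with an instruction to stay inside that context."""
        context = "\n".join(KNOWLEDGE_BASE[d] for d in retrieve(query))
        return f"Context:\n{context}\n\nQuestion: {query}\nAnswer only from the context."
    ```

    The point of the pattern is auditability: every answer can be traced back to the specific documents that were retrieved for it.
    
    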

    Feedback loops and performance reviews—forever

    Onboarding doesn’t end at go-live. The most meaningful learning begins after deployment.

    • Monitoring and observability: Log outputs, track KPIs (accuracy, satisfaction, escalation rates) and watch for degradation. Cloud providers now ship observability/evaluation tooling to help teams detect drift and regressions in production, especially for RAG systems whose knowledge changes over time.

    • User feedback channels. Provide in-product flagging and structured review queues so humans can coach the model — then close the loop by feeding these signals into prompts, RAG sources or fine-tuning sets.

    • Regular audits. Schedule alignment checks, factual audits and safety evaluations. Microsoft’s enterprise responsible-AI playbooks, for instance, emphasize governance and staged rollouts with executive visibility and clear guardrails.

    • Succession planning for models. As laws, products and models evolve, plan upgrades and retirement the way you would plan people transitions — run overlap tests and port institutional knowledge (prompts, eval sets, retrieval sources).
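    The monitoring bullet above can be made concrete with a small sketch. This is illustrative only, assuming human-reviewed correctness labels flow back from the feedback channel; the window size and threshold are arbitrary choices.

    ```python
    from collections import deque

    class CopilotMonitor:
        """Toy production monitor: tracks a rolling accuracy KPI from
        human-reviewed outputs and flags degradation (drift)."""

        def __init__(self, window: int = 100, threshold: float = 0.9):
            self.reviews = deque(maxlen=window)  # True = output judged correct
            self.threshold = threshold

        def record(self, correct: bool) -> None:
            self.reviews.append(correct)

        def accuracy(self) -> float:
            return sum(self.reviews) / len(self.reviews) if self.reviews else 1.0

        def degraded(self) -> bool:
            # Only alert once enough reviews accumulate to be meaningful.
            return len(self.reviews) >= 20 and self.accuracy() < self.threshold
    ```

    A `degraded()` alert would then trigger the audits and retraining steps described below, rather than waiting for users to notice quality slipping.
    
    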

    Why this is urgent now

    Gen AI is no longer an “innovation shelf” project — it’s embedded in CRMs, support desks, analytics pipelines and executive workflows. Banks like Morgan Stanley and Bank of America are focusing AI on internal copilot use cases to boost employee efficiency while constraining customer-facing risk, an approach that hinges on structured onboarding and careful scoping. Meanwhile, security leaders say gen AI is everywhere, yet one-third of adopters haven’t implemented basic risk mitigations, a gap that invites shadow AI and data exposure.

    The AI-native workforce also expects better: Transparency, traceability, and the ability to shape the tools they use. Organizations that provide this — through training, clear UX affordances and responsive product teams — see faster adoption and fewer workarounds. When users trust a copilot, they use it; when they don’t, they bypass it.

    As onboarding matures, expect to see AI enablement managers and PromptOps specialists in more org charts, curating prompts, managing retrieval sources, running eval suites and coordinating cross-functional updates. Microsoft’s internal Copilot rollout points to this operational discipline: Centers of excellence, governance templates and executive-ready deployment playbooks. These practitioners are the “teachers” who keep AI aligned with fast-moving business goals.

    A practical onboarding checklist

    If you’re introducing (or rescuing) an enterprise copilot, start here:

    1. Write the job description. Scope, inputs/outputs, tone, red lines, escalation rules.

    2. Ground the model. Implement RAG (and/or MCP-style adapters) to connect to authoritative, access-controlled sources; prefer dynamic grounding over broad fine-tuning where possible.

    3. Build the simulator. Create scripted and seeded scenarios; measure accuracy, coverage, tone, safety; require human sign-offs to graduate stages.

    4. Ship with guardrails. DLP, data masking, content filters and audit trails (see vendor trust layers and responsible-AI standards).

    5. Instrument feedback. In-product flagging, analytics and dashboards; schedule weekly triage.

    6. Review and retrain. Monthly alignment checks, quarterly factual audits and planned model upgrades — with side-by-side A/Bs to prevent regressions.
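    The side-by-side A/B in step 6 can be sketched as a frozen eval set run against both models, gating the upgrade on no regression. Model calls are stubbed here with plain functions; in practice they would hit your serving endpoints, and scoring would use graders rather than exact match.

    ```python
    # Hypothetical frozen eval set; real suites would be larger and domain-specific.
    EVAL_SET = [
        {"prompt": "2+2", "expected": "4"},
        {"prompt": "capital of France", "expected": "Paris"},
        {"prompt": "HTTP status for not found", "expected": "404"},
    ]

    def score(model, eval_set) -> float:
        """Fraction of eval cases the model answers exactly right."""
        hits = sum(model(case["prompt"]) == case["expected"] for case in eval_set)
        return hits / len(eval_set)

    def approve_upgrade(incumbent, candidate, eval_set) -> bool:
        """Approve only if the candidate shows no regression on the eval set."""
        return score(candidate, eval_set) >= score(incumbent, eval_set)
    ```

    Running both models over the same cases makes the comparison apples-to-apples, so a model swap never silently trades away quality on known-good behavior.
    
    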

    In a future where every employee has an AI teammate, the organizations that take onboarding seriously will move faster, safer and with greater purpose. Gen AI doesn’t just need data or compute; it needs guidance, goals, and growth plans. Treating AI systems as teachable, improvable and accountable team members turns hype into habitual value.

    Dhyey Mavani is accelerating generative AI at LinkedIn.

  • Abstract or die: Why AI enterprises can’t afford rigid vector stacks

    Vector databases (DBs), once specialist research instruments, have become widely used infrastructure in just a few years. They power today's semantic search, recommendation engines, anti-fraud measures and gen AI applications across industries. There is a deluge of options: PostgreSQL with pgvector, MySQL HeatWave, DuckDB VSS, SQLite VSS, Pinecone, Weaviate, Milvus and several others.

    The wealth of choices sounds like a boon to companies. But just beneath, a growing problem looms: Stack instability. New vector DBs appear each quarter, with disparate APIs, indexing schemes and performance trade-offs. Today's ideal choice may look dated or limiting tomorrow.

    For business AI teams, this volatility translates into lock-in risks and migration hell. Most projects begin life with lightweight engines like DuckDB or SQLite for prototyping, then move to Postgres, MySQL or a cloud-native service in production. Each switch involves rewriting queries, reshaping pipelines, and slowing down deployments.

    This re-engineering merry-go-round undermines the very speed and agility that AI adoption is supposed to bring.

    Why portability matters now

    Companies have a tricky balancing act:

    • Experiment quickly with minimal overhead, to test ideas and capture early value;

    • Scale safely on stable, production-quality infrastructure without months of refactoring;

    • Be nimble in a world where new and better backends arrive nearly every month.

    Without portability, organizations stagnate. They accumulate technical debt from duplicated code paths, hesitate to adopt new technology and cannot move prototypes to production at pace. In effect, the database becomes a bottleneck rather than an accelerator.

    Portability, the ability to swap underlying infrastructure without rewriting the application, is increasingly a strategic requirement for enterprises rolling out AI at scale.

    Abstraction as infrastructure

    The solution is not to pick the "perfect" vector database (there isn't one), but to change how enterprises think about the problem.

    In software engineering, the adapter pattern provides a stable interface while hiding underlying complexity. Historically, we've seen how this principle reshaped entire industries:

    • ODBC/JDBC gave enterprises a single way to query relational databases, reducing the risk of being tied to Oracle, MySQL or SQL Server;

    • Apache Arrow standardized columnar data formats, so data systems could play nice together;

    • ONNX created a vendor-agnostic format for machine learning (ML) models, bringing TensorFlow, PyTorch, etc. together;

    • Kubernetes abstracted infrastructure details, so workloads could run the same way across clouds;

    • any-llm (Mozilla AI) now provides a single API across many large language model (LLM) vendors, making it safer to experiment with AI.

    All these abstractions drove adoption by lowering switching costs. They turned fragmented ecosystems into solid, enterprise-grade infrastructure.

    Vector databases are now at the same tipping point.

    The adapter approach to vectors

    Instead of having application code directly bound to some specific vector backend, companies can build against an abstraction layer that normalizes operations like inserts, queries and filtering.

    This doesn't necessarily eliminate the need to choose a backend; it makes that choice less rigid. Development teams can start with DuckDB or SQLite in the lab, then scale up to Postgres or MySQL for production and ultimately adopt a special-purpose cloud vector DB without having to re-architect the application.

    Open source efforts like Vectorwrap are early examples of this approach, presenting a single Python API to Postgres, MySQL, DuckDB and SQLite. They demonstrate the power of abstraction to accelerate prototyping, reduce lock-in risk and support hybrid architectures employing numerous backends.
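    The adapter idea can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Vectorwrap's actual API: the `VectorStore` interface and `InMemoryStore` backend are hypothetical names, and a production adapter would wrap Postgres/pgvector, DuckDB or a cloud service behind the same interface.

    ```python
    import math
    from abc import ABC, abstractmethod


    class VectorStore(ABC):
        """Backend-agnostic interface: application code targets this,
        never a vendor-specific API."""

        @abstractmethod
        def insert(self, key: str, vector: list[float]) -> None: ...

        @abstractmethod
        def query(self, vector: list[float], top_k: int = 5) -> list[tuple[str, float]]: ...


    class InMemoryStore(VectorStore):
        """Reference backend for prototyping; swap in a pgvector or
        cloud-DB adapter for production without touching app code."""

        def __init__(self) -> None:
            self._rows: dict[str, list[float]] = {}

        def insert(self, key: str, vector: list[float]) -> None:
            self._rows[key] = vector

        def query(self, vector: list[float], top_k: int = 5) -> list[tuple[str, float]]:
            def cosine(a: list[float], b: list[float]) -> float:
                dot = sum(x * y for x, y in zip(a, b))
                norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
                return dot / norm

            scored = [(k, cosine(vector, v)) for k, v in self._rows.items()]
            return sorted(scored, key=lambda kv: -kv[1])[:top_k]


    # Application code depends only on the interface:
    store: VectorStore = InMemoryStore()
    store.insert("doc-a", [1.0, 0.0])
    store.insert("doc-b", [0.0, 1.0])
    print(store.query([0.9, 0.1], top_k=1))  # doc-a ranks first
    ```

    Because the application holds only a `VectorStore` reference, moving from the lab to production means instantiating a different adapter, not rewriting queries or pipelines.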

    Why businesses should care

    For leaders of data infrastructure and decision-makers for AI, abstraction offers three benefits:

    Speed from prototype to production

    Teams are able to prototype on lightweight local environments and scale without expensive rewrites.

    Reduced vendor risk

    By decoupling app code from specific databases, organizations can adopt new backends as they emerge without long migration projects.

    Hybrid flexibility

    Companies can mix transactional, analytical and specialized vector DBs under one architecture, all behind an aggregated interface.

    The result is data layer agility, and that's more and more the difference between fast and slow companies.

    A broader movement in open source

    What's happening in the vector space is one example of a bigger trend: Open-source abstractions as critical infrastructure.

    • In data formats: Apache Arrow

    • In ML models: ONNX

    • In orchestration: Kubernetes

    • In AI APIs: any-llm and similar frameworks

    These projects succeed, not by adding new capability, but by removing friction. They enable enterprises to move more quickly, hedge bets and evolve along with the ecosystem.

    Vector DB adapters continue this legacy, transforming a high-speed, fragmented space into infrastructure that enterprises can truly depend on.

    The future of vector DB portability

    The landscape of vector DBs will not converge anytime soon. Instead, the number of options will grow, and every vendor will tune for different use cases, scale, latency, hybrid search, compliance or cloud platform integration.

    Abstraction becomes strategy in this case. Companies adopting portable approaches will be capable of:

    • Prototyping boldly

    • Deploying in a flexible manner

    • Adopting new tech rapidly

    It's possible we'll eventually see a "JDBC for vectors," a universal standard that codifies queries and operations across backends. Until then, open-source abstractions are laying the groundwork.

    Conclusion

    Enterprises adopting AI cannot afford to be slowed by database lock-in. As the vector ecosystem evolves, the winners will be those who treat abstraction as infrastructure, building against portable interfaces rather than binding themselves to any single backend.

    The decades-long lesson of software engineering is simple: Standards and abstractions lead to adoption. For vector DBs, that revolution has already begun.

    Mihir Ahuja is an AI/ML engineer and open-source contributor based in San Francisco.

  • Codev lets enterprises avoid vibe coding hangovers with a team of agents that generate and document code

    For many software developers using generative AI, vibe coding is a double-edged sword.

    The process delivers rapid prototypes but often leaves a trail of brittle, undocumented code that creates significant technical debt.

    A new open-source platform, Codev, addresses this by proposing a fundamental shift: treating the natural language conversation with an AI as part of the actual source code.

    Codev is based on SP(IDE)R, a framework designed to turn vibe-coding conversations into structured, versioned, and auditable assets that become part of the code repository.

    What is Codev?

    At its core, Codev is a methodology that treats natural language context as an integral part of the development lifecycle, rather than a disposable artifact, as is the case with vanilla vibe coding.

    According to co-founder Waleed Kadous, the goal is to invert the typical engineering workflow.

    "A key principle of Codev is that documents like the specification are the actual code of the system," he told VentureBeat. "It's almost like natural language is compiled down into TypeScript by our agents."

    This approach avoids the common pitfall where documentation is created after the fact, if at all.

    Its flagship protocol, SP(IDE)R, provides a lightweight but formal structure for building software. The process begins with Specify, where a human and multiple AI agents collaborate to turn a high-level request into concrete acceptance criteria. Next, in the Plan stage, an AI proposes a phased implementation, which is again reviewed.

    For each phase, the AI enters an IDE loop: it Implements the code, Defends it against bugs and regression with comprehensive tests, and Evaluates the result against the specification. The final step is Review, where the team documents lessons learned to update and improve the SP(IDE)R protocol itself for future projects.

    The framework’s key differentiator is its use of multiple agents and explicit human review at different stages. Kadous notes that each agent brings unique strengths to the review process.

    "Gemini is extremely good at catching security issues," he said, citing a critical cross-site scripting (XSS) flaw and another bug that "would have shared an OpenAI API key with the client, which could cost thousands of dollars."

    Meanwhile, "GPT-5 is very good at understanding how to simplify a design." This structured review, with a human providing final approval at each stage, prevents the kind of runaway automation that leads to flawed code.

    The platform’s AI-native philosophy extends to its installation. There is no complex installer; instead, a user instructs their AI agent to apply the Codev GitHub repository to set up the project. The developers "dogfooded" their framework, using Codev to build Codev.

    “The key point here is that natural language is executable now, with the agent being the interpreter,” Kadous said. “This is great because it means it's not a ‘blind’ integration of Codev, the agent gets to choose the best way to integrate it and can intelligently make decisions.”

    Codev case study

    To test the framework's effectiveness, its creators ran a direct comparison between vanilla vibe-coding and Codev. They gave Claude Opus 4.1 a request to build a modern web-based todo manager. The first attempt used a conversational, vibe-coding approach. The result was a plausible-looking demo. However, an automated analysis conducted by three independent AI agents found that it had implemented 0% of the required functionality, contained no tests, and lacked a database or API.

    The second attempt used the same AI model and prompt but applied the SP(IDE)R protocol. This time, the AI produced a production-ready application with 32 source files, 100% of the specified functionality, five test suites, a SQLite database, and a complete RESTful API.

    Throughout this process, the human developers reported they never directly edited a single line of source code. While this was a single experiment, Kadous estimates the impact is substantial.

    "Subjectively, it feels like I'm about three times as productive with Codev as without," he says. The quality also speaks for itself. "I used LLMs as a judge, and one of them described the output like what a well-oiled engineering team would produce. That was exactly what I was aiming for."

    While the process is powerful, it redefines the developer's role from a hands-on coder to a system architect and reviewer. According to Kadous, the initial spec and plan stages can each take between 45 minutes and two hours of focused collaboration.

    This is in contrast to the impression given by many vibe-coding platforms, where a single prompt and a few minutes of processing give you a fully functional, scalable application.

    "All of the value I add is in the background knowledge I apply to the specs and plans," he explains. He emphasizes that the framework is designed to augment, not replace, experienced talent. "The people who will do the best… are senior engineers and above because they know the pitfalls… It just takes the senior engineer you already have and makes them much more productive."

    A future of human and AI collaboration

    Frameworks like Codev signal a shift where the primary creative act of software development moves from writing code to crafting precise, machine-readable specifications and plans. For enterprise teams, this means AI-generated code can become auditable, maintainable, and reliable. By capturing the entire development conversation in version control and enforcing it with CI, the process turns ephemeral chats into durable engineering assets.

    Codev proposes a future where the AI acts not as a chaotic assistant, but as a disciplined collaborator in a structured, human-led workflow.

    However, Kadous acknowledges this shift creates new challenges for the workforce. "Senior engineers that reject AI outright will be outpaced by senior engineers who embrace it," he predicts. He also expresses concern for junior developers who may not get the chance "to build their architectural chops," a skill that becomes even more critical when guiding AI.

    This highlights a central challenge for the industry: ensuring that as AI elevates top performers, it also creates pathways to develop the next generation of talent.

  • Developers can now add live Google Maps data to Gemini-powered AI app outputs

    Google is adding a new feature for third-party developers building atop its Gemini AI models that rivals like OpenAI's ChatGPT, Anthropic's Claude, and the growing array of Chinese open source options are unlikely to get anytime soon: grounding with Google Maps.

    This addition allows developers to connect Google's Gemini AI models' reasoning capabilities with live geospatial data from Google Maps, enabling applications to deliver detailed, location-relevant responses to user queries—such as business hours, reviews, or the atmosphere of a specific venue.

    By tapping into data from over 250 million places, developers can now build more intelligent and responsive location-aware experiences.

    This is particularly useful for applications where proximity, real-time availability, or location-specific personalization matter—such as local search, delivery services, real estate, and travel planning.

    When the user’s location is known, developers can pass latitude and longitude into the request to enhance the response quality.

    By tightly integrating real-time and historical Maps data into the Gemini API, Google enables applications to generate grounded, location-specific responses with factual accuracy and contextual depth that are uniquely possible through its mapping infrastructure.

    Merging AI and Geospatial Intelligence

    The new feature is accessible in Google AI Studio, where developers can try a live demo powered by the Gemini Live API. Models that support the grounding with Google Maps include:

    • Gemini 2.5 Pro

    • Gemini 2.5 Flash

    • Gemini 2.5 Flash-Lite

    • Gemini 2.0 Flash

    In one demonstration, a user asked for Italian restaurant recommendations in Chicago.

    The assistant, leveraging Maps data, retrieved top-rated options and clarified a misspelled restaurant name before locating the correct venue with accurate business details.

    Developers can also retrieve a context token to embed a Google Maps widget in their app’s user interface. This interactive component displays photos, reviews, and other familiar content typically found in Google Maps.

    Integration is handled via the generateContent method in the Gemini API, where developers include googleMaps as a tool. They can also enable a Maps widget by setting a parameter in the request. The widget, rendered using a returned context token, can provide a visual layer alongside the AI-generated text.
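    As a rough sketch, a Maps-grounded request body might look like the following. The field names are illustrative, inferred from the article's description (googleMaps listed as a tool, optional user coordinates passed along); consult Google's Gemini API reference for the exact schema.

    ```python
    import json

    # Hypothetical generateContent request body. Structure is a sketch
    # based on the article's description, not the authoritative schema.
    request_body = {
        "contents": [
            {"role": "user", "parts": [{"text": "Best Italian restaurants near me?"}]}
        ],
        "tools": [
            {"googleMaps": {}}  # enable grounding with Google Maps
        ],
        "toolConfig": {
            # Optional: pass the user's coordinates when known,
            # to improve response quality.
            "retrievalConfig": {"latLng": {"latitude": 41.88, "longitude": -87.63}}
        },
    }

    print(json.dumps(request_body, indent=2))
    ```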

    Use Cases Across Industries

    The Maps grounding tool is designed to support a wide range of practical use cases:

    • Itinerary generation: Travel apps can create detailed daily plans with routing, timing, and venue information.

    • Personalized local recommendations: Real estate platforms can highlight listings near kid-friendly amenities like schools and parks.

    • Detailed location queries: Applications can provide specific information, such as whether a cafe offers outdoor seating, using community reviews and Maps metadata.

    Developers are encouraged to only enable the tool when geographic context is relevant, to optimize both performance and cost.

    According to the developer documentation, pricing starts at $25 per 1,000 grounded prompts, a steep price for high-volume applications.

    Combining Search and Maps for Enhanced Context

    Developers can use Grounding with Google Maps alongside Grounding with Google Search in the same request.

    While the Maps tool contributes factual data—like addresses, hours, and ratings—the Search tool adds broader context from web content, such as news or event listings.

    For example, when asked about live music on Beale Street, the combined tools provide venue details from Maps and event times from Search.

    According to Google, internal testing shows that using both tools together leads to significantly improved response quality.

    Customization and Developer Flexibility

    The experience is built for customization. Developers can tweak system prompts, choose from different Gemini models, and configure voice settings to tailor interactions.

    The demo app in Google AI Studio is also remixable, enabling developers to test ideas, add features, and iterate on designs within a flexible development environment.

    The API returns structured metadata—including source links, place IDs, and citation spans—that developers can use to build inline citations or verify the AI-generated outputs.

    This supports transparency and enhances trust in user-facing applications. Google also requires that Maps-based sources be attributed clearly and linked back to the source using their URI.
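    As a sketch of how that metadata could drive inline citations, the snippet below assumes a hypothetical response shape: the article mentions source links, place IDs and citation spans, but the exact field names used here are illustrative.

    ```python
    # Illustrative grounding metadata; real Gemini API field names may differ.
    sample_metadata = {
        "groundingChunks": [
            {
                "title": "Trattoria Roma",
                "uri": "https://maps.google.com/?cid=123",
                "placeId": "abc123",
            },
        ],
        "groundingSupports": [
            {"text": "Trattoria Roma is open until 10pm.", "chunkIndices": [0]},
        ],
    }


    def render_citations(metadata: dict) -> list[str]:
        """Attach a numbered, linked Maps source to each supported span,
        satisfying the attribution requirement described above."""
        chunks = metadata["groundingChunks"]
        lines = []
        for support in metadata["groundingSupports"]:
            refs = ", ".join(
                f"[{i + 1}] {chunks[i]['title']} ({chunks[i]['uri']})"
                for i in support["chunkIndices"]
            )
            lines.append(f"{support['text']} {refs}")
        return lines


    for line in render_citations(sample_metadata):
        print(line)
    ```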

    Implementation Considerations for AI Builders

    For technical teams integrating this capability, Google recommends:

    • Passing user location context when known, for better results.

    • Displaying Google Maps source links directly beneath the relevant content.

    • Only enabling the tool when the query clearly involves geographic context.

    • Monitoring latency and disabling grounding when performance is critical.

    Grounding with Google Maps is currently available globally, though prohibited in several territories (including China, Iran, North Korea, and Cuba), and not permitted for emergency response use cases.

    Availability and Access

    Grounding with Google Maps is now generally available through the Gemini API.

    With this release, Google continues to expand the capabilities of the Gemini API, empowering developers to build AI-driven applications that understand and respond to the world around them.

  • Amazon and Chobani adopt Strella’s AI interviews for customer research as fast-growing startup raises $14M

    One year after emerging from stealth, Strella has raised $14 million in Series A funding to expand its AI-powered customer research platform, the company announced Thursday. The round, led by Bessemer Venture Partners with participation from Decibel Partners, Bain Future Back Ventures, MVP Ventures and 645 Ventures, comes as enterprises increasingly turn to artificial intelligence to understand customers faster and more deeply than traditional methods allow.

    The investment marks a sharp acceleration for the startup founded by Lydia Hylton and Priya Krishnan, two former consultants and product managers who watched companies struggle with a customer research process that could take eight weeks from start to finish. Since October, Strella has grown revenue tenfold, quadrupled its customer base to more than 40 paying enterprises, and tripled its average contract values by moving upmarket to serve Fortune 500 companies.

    "Research tends to be bookended by two very strategic steps: first, we have a problem—what research should we do? And second, we've done the research—now what are we going to do with it?" said Hylton, Strella's CEO, in an exclusive interview with VentureBeat. "All the stuff in the middle tends to be execution and lower-skill work. We view Strella as doing that middle 90% of the work."

    The platform now serves Amazon, Duolingo, Apollo GraphQL, and Chobani, collectively conducting thousands of AI-moderated interviews that deliver what the company claims is a 90% average time savings on manual research work. The company is approaching $1 million in revenue after beginning monetization only in January, with month-over-month growth of 50% and zero customer churn to date.

    How AI-powered interviews compress eight-week research projects into days

    Strella's technology addresses a workflow that has frustrated product teams, marketers, and designers for decades. Traditional customer research requires writing interview guides, recruiting participants, scheduling calls, conducting interviews, taking notes, synthesizing findings, and creating presentations — a process that consumes weeks of highly-skilled labor and often delays critical product decisions.

    The platform compresses that timeline to days by using AI to moderate voice-based interviews that run like Zoom calls, but with an artificial intelligence agent asking questions, following up on interesting responses, and detecting when participants are being evasive or fraudulent. The system then synthesizes findings automatically, creating highlight reels and charts from unstructured qualitative data.

    "It used to take eight weeks. Now you can do it in the span of a couple days," Hylton told VentureBeat. "The primary technology is through an AI-moderated interview. It's like being in a Zoom call with an AI instead of a human — it's completely free form and voice based."

    Critically, the platform also supports human moderators joining the same calls, reflecting the founders' belief that humans won't disappear from the research process. "Human moderation won't go away, which is why we've supported human moderation from our genesis," Hylton said.

    Why customers tell AI moderators the truth they won't share with humans

    One of Strella's most surprising findings challenges assumptions about AI in qualitative research: participants appear more honest with AI moderators than with humans. The founders discovered this pattern repeatedly as customers ran head-to-head comparisons between traditional human-moderated studies and Strella's AI approach.

    "If you're a designer and you get on a Zoom call with a customer and you say, 'Do you like my design?' they're always gonna say yes. They don't want to hurt your feelings," Hylton explained. "But it's not a problem at all for Strella. They would tell you exactly what they think about it, which is really valuable. It's very hard to get honest feedback."

    Krishnan, Strella's COO, said companies initially worried about using AI and "eroding quality," but the platform has "actually found the opposite to be true. People are much more open and honest with an AI moderator, and so the level of insight that you get is much richer because people are giving their unfiltered feedback."

    This dynamic has practical business implications. Brian Santiago, Senior Product Design Manager at Apollo GraphQL, said in a statement: "Before Strella, studies took weeks. Now we get insights in a day — sometimes in just a few hours. And because participants open up more with the AI moderator, the feedback is deeper and more honest."

    The platform also addresses endemic fraud in online surveys, particularly when participants are compensated. Because Strella interviews happen on camera in real time, the AI moderator can detect when someone pauses suspiciously long — perhaps to consult ChatGPT — and flags them as potentially fraudulent. "We are fraud resistant," Hylton said, contrasting this with traditional surveys where fraud rates can be substantial.

    Solving mobile app research with persistent screen sharing technology

    A major focus of the Series A funding will be expanding Strella's recently-launched mobile application, which Krishnan identified as critical competitive differentiation. The mobile app enables persistent screen sharing during interviews — allowing researchers to watch users navigate mobile applications in real time while the AI moderator asks about their experience.

    "We are the only player in the market that supports screen sharing on mobile," Hylton said. "You know, I want to understand what are the pain points with my app? Why do people not seem to be able to find the checkout flow? Well, in order to do that effectively, you'd like to see the user screen while they're doing an interview."

    For consumer-facing companies where mobile represents the primary customer interface, this capability opens entirely new use cases. The founders noted that "several of our customers didn't do research before" but have now built research practices around Strella because the platform finally made mobile research accessible at scale.

    The platform also supports embedding traditional survey question types directly into the conversational interview, approaching what Hylton called "feature parity with a survey" while maintaining the engagement advantages of a natural conversation. Strella interviews regularly run 60 to 90 minutes with nearly 100% completion rates—a duration that would see 60-70% drop-off in a traditional survey format.

    How Strella differentiated in a market crowded with AI research startups

    Strella enters a market that appears crowded at first glance, with established players like Qualtrics and a wave of AI-powered startups promising to transform customer research. The founders themselves initially pursued a different approach — synthetic respondents, or "digital twins" that simulate customer perspectives using large language models.

    "We actually pivoted from that. That was our initial idea," Hylton revealed, referring to synthetic respondents. "People are very intrigued by that concept, but we found, in practice, no willingness to pay right now."

    Recent research suggesting companies could use language models as digital twins for customer feedback has reignited interest in that approach. But Hylton remains skeptical: "The capabilities of the LLMs as they are today are not good enough, in my opinion, to justify a standalone company. Right now you could just ask ChatGPT, 'What would new users of Duolingo think about this ad copy?' You can do that. Adding the standalone idea of a synthetic panel is sort of just putting a wrapper on that."

    Instead, Strella's bet is that the real value lies in collecting proprietary qualitative data at scale — building what could become "the system of truth for all qualitative insights" within enterprises, as Lindsey Li, Vice President at Bessemer Venture Partners, described it.

    Li, who led the investment just one year after Strella emerged from stealth, said the firm was convinced by both the technology and the team. "Strella has built highly differentiated technology that enables a continuous interview rather than a survey," Li said. "We heard time and time again that customers loved this product experience relative to other offerings."

    On the defensibility question that concerns many AI investors, Li emphasized product execution over patents: "We think the long game here will be won with a million small product decisions, all of which must be driven by deep empathy for customer pain and an understanding of how best to address their needs. Lydia and Priya exhibit that in spades."

    The founders point to technical depth that's difficult to replicate. Most competitors started with adaptive surveys — text-based interfaces where users type responses and wait for the next question. Some have added voice, but typically as uploaded audio clips rather than free-flowing conversation.

    "Our approach is fundamentally better, which is the fact that it is a free form conversation," Hylton said. "You never have to control anything. You're never typing, there's no buttons, there's no upload and wait for the next question. It's completely free form, and that has been an extraordinarily hard product to build. There's a tremendous amount of IP in the way that we prompt our moderator, the way that we run analysis."

    The platform also improves with use, learning from each customer's research patterns to fine-tune future interview guides and questions. "Our product gets better for our customers as they continue to use us," Hylton said. All research accumulates in a central repository where teams can generate new insights by chatting with the data or creating visualizations from previously unstructured qualitative feedback.

    Creating new research budgets instead of just automating existing ones

    Perhaps more important than displacing existing research is expanding the total market. Krishnan said growth has been "fundamentally related to our product" creating new research that wouldn't have happened otherwise.

    "We have expanded the use cases in which people would conduct research," Krishnan explained. "Several of our customers didn't do research before, have always wanted to do research, but didn't have a dedicated researcher or team at their company that was devoted to it, and have purchased Strella to kick off and enable their research practice. That's been really cool where we've seen this market just opening up."

    This expansion comes as enterprises face mounting pressure to improve customer experience amid declining satisfaction scores. According to Forrester Research's 2024 Customer Experience Index, customer experience quality has declined for three consecutive years — an unprecedented trend. The report found that 39% of brands saw CX quality deteriorate, with declines across effectiveness, ease, and emotional connection.

    Meanwhile, Deloitte's 2025 Technology, Media & Telecommunications Predictions report forecasts that 25% of enterprises using generative AI will deploy AI agents by 2025, growing to 50% by 2027. The report specifically highlighted AI's potential to enhance customer satisfaction by 15-20% while reducing cost to serve by 20-30% when properly implemented.

    Gartner identified conversational user interfaces — the category Strella inhabits — as one of three technologies poised to transform customer service by 2028, noting that "customers increasingly expect to be able to interact with the applications they use in a natural way."

    Against this backdrop, Li sees substantial room for growth. "UX Research is a sub-sector of the $140B+ global market-research industry," Li said. "This includes both the software layer historically (~$430M) and professional services spend on UX research, design, product strategy, etc. which is conservatively estimated to be ~$6.4B+ annually. As software in this vertical, led by Strella, becomes more powerful, we believe the TAM will continue to expand meaningfully."

    Making customer feedback accessible across the enterprise, not just research teams

    The founders describe their mission as "democratizing access to the customer" — making it possible for anyone in an organization to understand customer perspectives without waiting for dedicated research teams to complete months-long studies.

    "Many, many, many positions in the organization would like to get customer feedback, but it's so hard right now," Hylton said. With Strella, she explained, someone can "log into Strella and through a chat, create any highlight reel that you want and actually see customers in their own words answering the question that you have based on the research that's already been done."

    This video-first approach to research repositories changes organizational dynamics around customer feedback. "Then you can say, 'Okay, engineering team, we need to build this feature. And here's the customer actually saying it,'" Hylton continued. "'This is not me. This isn't politics. Here are seven customers saying they can't find the Checkout button.' The fact that we are a very video-based platform really allows us to do that quickly and painlessly."

    The company has moved decisively upmarket, with contract values now typically in the five-figure range and "several six figure contracts" signed, according to Krishnan. The pricing strategy reflects a premium positioning: "Our product is very good, it's very premium. We're charging based on the value it provides to customers," Krishnan said, rather than competing on cost alone.

    This approach appears to be working. The company reports 100% conversion from pilot programs to paid contracts and zero churn among its 40-45 customers, with month-over-month revenue growth of 50%.

    The roadmap: Computer vision, agentic AI, and human-machine collaboration

    The Series A funding will primarily support scaling product and go-to-market teams. "We're really confident that we have product-market fit," Hylton said. "And now the question is execution, and we want to hire a lot of really talented people to help us execute."

    On the product roadmap, Hylton emphasized continued focus on the participant experience as the key to winning the market. "Everything else is downstream of a joyful participant experience," she said, including "the quality of insights, the amount you have to pay people to do the interviews, and the way that your customers feel about a company."

    Near-term priorities include adding visual capabilities so the AI moderator can respond to facial expressions and other nonverbal cues, and building more sophisticated collaboration features between human researchers and AI moderators. "Maybe you want to listen while an AI moderator is running a call and you might want to be able to jump in with specific questions," Hylton said. "Or you want to run an interview yourself, but you want the moderator to be there as backup or to help you."

    These features move toward what the industry calls "agentic AI" — systems that can act more autonomously while still collaborating with humans. The founders see this human-AI collaboration, rather than full automation, as the sustainable path forward.

    "We believe that a lot of the really strategic work that companies do will continue to be human moderated," Hylton said. "And you can still do that through Strella and just use us for synthesis in those cases."

    For Li and Bessemer, the bet is on founders who understand this nuance. "Lydia and Priya exhibit the exact archetype of founders we are excited to partner with for the long term — customer-obsessed, transparent, thoughtful, and singularly driven towards the home-run scenario," she said.

    The company declined to disclose specific revenue figures or valuation. With the new funding, Strella has now raised $18 million total, including a $4 million seed round led by Decibel Partners announced in October.

    As Strella scales, the founders remain focused on a vision where technology enhances rather than eliminates human judgment—where an engineering team doesn't just read a research report, but watches seven customers struggle to find the same button. Where a product manager can query months of accumulated interviews in seconds. Where companies don't choose between speed and depth, but get both.

    "The interesting part of the business is actually collecting that proprietary dataset, collecting qualitative research at scale," Hylton said, describing what she sees as Strella's long-term moat. Not replacing the researcher, but making everyone in the company one.

  • How Anthropic’s ‘Skills’ make Claude faster, cheaper, and more consistent for business workflows

    Anthropic launched a new capability on Thursday that allows its Claude AI assistant to tap into specialized expertise on demand, marking the company's latest effort to make artificial intelligence more practical for enterprise workflows as it chases rival OpenAI in the intensifying competition over AI-powered software development.

    The feature, called Skills, enables users to create folders containing instructions, code scripts, and reference materials that Claude can automatically load when relevant to a task. The system marks a fundamental shift in how organizations can customize AI assistants, moving beyond one-off prompts to reusable packages of domain expertise that work consistently across an entire company.

    "Skills are based on our belief and vision that as model intelligence continues to improve, we'll continue moving towards general-purpose agents that often have access to their own filesystem and computing environment," said Mahesh Murag, a member of Anthropic's technical staff, in an exclusive interview with VentureBeat. "The agent is initially made aware only of the names and descriptions of each available skill and can choose to load more information about a particular skill when relevant to the task at hand."
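
The folder-of-files idea can be pictured with a minimal hypothetical skill. The layout and frontmatter fields below are assumptions inferred from the description above, not Anthropic's documented format; consult the official Skills documentation for the real schema.

```text
brand-guidelines/
├── SKILL.md          # name + description Claude sees up front, plus full instructions
├── palette.json      # reference material loaded only when the skill is invoked
└── apply_theme.py    # executable script Claude can run in its sandboxed environment
```

A hypothetical `SKILL.md` might open with a short header that supplies the name and description the agent sees before deciding to load anything else:

```text
---
name: brand-guidelines
description: Company fonts, colors, and tone of voice for external assets
---
Step-by-step instructions, templates, and references follow here...
```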

    The launch comes as Anthropic, valued at $183 billion after a recent $13 billion funding round, projects its annual revenue could nearly triple to as much as $26 billion in 2026, according to a recent Reuters report. The company is currently approaching a $7 billion annual revenue run rate, up from $5 billion in August, fueled largely by enterprise adoption of its AI coding tools — a market where it faces fierce competition from OpenAI's recently upgraded Codex platform.

    How 'progressive disclosure' solves the context window problem

    Skills differ fundamentally from existing approaches to customizing AI assistants, such as prompt engineering or retrieval-augmented generation (RAG), Murag explained. The architecture relies on what Anthropic calls "progressive disclosure" — Claude initially sees only skill names and brief descriptions, then autonomously decides which skills to load based on the task at hand, accessing only the specific files and information needed at that moment.

    "Unlike RAG, this relies on simple tools that let Claude manage and read files from a filesystem," Murag told VentureBeat. "Skills can contain an unbounded amount of context to teach Claude how to complete a task or series of tasks. This is because Skills are based on the premise of an agent being able to autonomously and intelligently navigate a filesystem and execute code."

    This approach allows organizations to bundle far more information than traditional context windows permit, while maintaining the speed and efficiency that enterprise users demand. A single skill can include step-by-step procedures, code templates, reference documents, brand guidelines, compliance checklists, and executable scripts — all organized in a folder structure that Claude navigates intelligently.
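
The loading pattern Murag describes can be sketched in a few lines. This is an illustrative toy, not Anthropic's implementation; all class and field names here are hypothetical.

```python
# Sketch of "progressive disclosure": the agent starts with a lightweight
# index of skill names and descriptions, and pulls a skill's full contents
# into context only when it decides the skill is relevant.
from dataclasses import dataclass, field


@dataclass
class Skill:
    name: str
    description: str                                 # always visible to the agent
    files: dict = field(default_factory=dict)        # loaded only on demand


class Agent:
    def __init__(self, skills):
        self.skills = {s.name: s for s in skills}
        self.loaded = set()

    def catalog(self):
        # Step 1: only names and brief descriptions occupy the context window.
        return {s.name: s.description for s in self.skills.values()}

    def load(self, name):
        # Step 2: the full folder contents enter context for this task only.
        self.loaded.add(name)
        return self.skills[name].files


brand = Skill("brand-guidelines", "Company fonts, colors, tone",
              {"SKILL.md": "Use the corporate palette...", "palette.json": "{}"})
finance = Skill("financial-reporting", "Quarterly report procedures",
                {"SKILL.md": "Reconcile ledgers before summarizing."})

agent = Agent([brand, finance])
index = agent.catalog()              # cheap: two short strings, not whole folders
files = agent.load("brand-guidelines")
```

Because the index is tiny, an organization can register far more skill material than a context window could hold at once; only the skills actually invoked pay a context cost.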

    The system's composability provides another technical advantage. Multiple skills automatically stack together when needed for complex workflows. For instance, Claude might simultaneously invoke a company's brand guidelines skill, a financial reporting skill, and a presentation formatting skill to generate a quarterly investor deck — coordinating between all three without manual intervention.

    What makes Skills different from OpenAI's Custom GPTs and Microsoft's Copilot

    Anthropic is positioning Skills as distinct from competing offerings like OpenAI's Custom GPTs and Microsoft's Copilot Studio, though the features address similar enterprise needs around AI customization and consistency.

    "Skills' combination of progressive disclosure, composability, and executable code bundling is unique in the market," Murag said. "While other platforms require developers to build custom scaffolding, Skills let anyone — technical or not — create specialized agents by organizing procedural knowledge into files."

    The cross-platform portability also sets Skills apart. The same skill works identically across Claude.ai, Claude Code (Anthropic's AI coding environment), the company's API, and the Claude Agent SDK for building custom AI agents. Organizations can develop a skill once and deploy it everywhere their teams use Claude, a significant advantage for enterprises seeking consistency.

    The feature supports any programming language compatible with the underlying container environment, and Anthropic provides sandboxing for security — though the company acknowledges that allowing AI to execute code requires users to carefully vet which skills they trust.

    Early customers report 8x productivity gains on finance workflows

    Early customer implementations reveal how organizations are applying Skills to automate complex knowledge work. At Japanese e-commerce giant Rakuten, the AI team is using Skills to transform finance operations that previously required manual coordination across multiple departments.

    "Skills streamline our management accounting and finance workflows," said Yusuke Kaji, general manager of AI at Rakuten, in a statement. "Claude processes multiple spreadsheets, catches critical anomalies, and generates reports using our procedures. What once took a day, we can now accomplish in an hour."
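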

    That's an 8x improvement in productivity for specific workflows — the kind of measurable return on investment that enterprises increasingly demand from AI implementations. Mike Krieger, Anthropic's chief product officer and Instagram co-founder, recently noted that companies have moved past "AI FOMO" to requiring concrete success metrics.

    Design platform Canva plans to integrate Skills into its own AI agent workflows. "Canva plans to leverage Skills to customize agents and expand what they can do," said Anwar Haneef, general manager and head of ecosystem at Canva, in a statement. "This unlocks new ways to bring Canva deeper into agentic workflows—helping teams capture their unique context and create stunning, high-quality designs effortlessly."

    Cloud storage provider Box sees Skills as a way to make corporate content repositories more actionable. "Skills teaches Claude how to work with Box content," said Yashodha Bhavnani, head of AI at Box. "Users can transform stored files into PowerPoint presentations, Excel spreadsheets, and Word documents that follow their organization's standards—saving hours of effort."

    The enterprise security question: Who controls which AI skills employees can use?

    For enterprise IT departments, Skills raise important questions about governance and control—particularly since the feature allows AI to execute arbitrary code in sandboxed environments. Anthropic has built administrative controls that allow enterprise customers to manage access at the organizational level.

    "Enterprise admins control access to the Skills capability via admin settings, where they can enable or disable access and monitor usage patterns," Murag said. "Once enabled at the organizational level, individual users still need to opt in."

    That two-layer consent model — organizational enablement plus individual opt-in — reflects lessons learned from previous enterprise AI deployments where blanket rollouts created compliance concerns. However, Anthropic's governance tools appear more limited than some enterprise customers might expect. The company doesn't currently offer granular controls over which specific skills employees can use, or detailed audit trails of custom skill content.

    Organizations concerned about data security should note that Skills require Claude's code execution environment, which runs in isolated containers. Anthropic advises users to "stick to trusted sources" when installing skills and provides security documentation, but the company acknowledges this is an inherently higher-risk capability than traditional AI interactions.

    From API to no-code: How Anthropic is making Skills accessible to everyone

    Anthropic is taking several approaches to make Skills accessible to users with varying technical sophistication. For non-technical users on Claude.ai, the company provides a "skill-creator" skill that interactively guides users through building new skills by asking questions about their workflow, then automatically generating the folder structure and documentation.

    Developers working with Anthropic's API get programmatic control through a new /skills endpoint and can manage skill versions through the Claude Console web interface. The feature requires enabling the Code Execution Tool beta in API requests. For Claude Code users, skills can be installed via plugins from the anthropics/skills GitHub marketplace, and teams can share skills through version control systems.
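
For developers, the workflow might look roughly like the sketch below. Only the `/skills` endpoint path comes from the article; the base URL, payload fields, and upload shape are assumptions for illustration, not Anthropic's documented API, so check the official API reference before relying on any of them.

```python
# Hypothetical sketch of preparing a skill upload for the /skills endpoint.
# The request body schema here is assumed, not documented.
import json

API_BASE = "https://api.anthropic.com/v1"   # assumed base URL


def build_skill_upload(name: str, description: str, files: dict) -> dict:
    """Assemble an illustrative request body for registering a skill."""
    return {
        "name": name,
        "description": description,
        "files": [{"path": p, "content": c} for p, c in files.items()],
    }


payload = build_skill_upload(
    "financial-reporting",
    "Quarterly report procedures",
    {"SKILL.md": "Reconcile ledgers before generating the summary."},
)
url = f"{API_BASE}/skills"
body = json.dumps(payload)          # would be POSTed with an API key header
```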

    "Skills are included in Max, Pro, Teams, and Enterprise plans at no additional cost," Murag confirmed. "API usage follows standard API pricing," meaning organizations pay only for the tokens consumed during skill execution, not for the skills themselves.

    Anthropic provides several pre-built skills for common business tasks, including professional generation of Excel spreadsheets with formulas, PowerPoint presentations, Word documents, and fillable PDFs. These Anthropic-created skills will remain free.

    Why the Skills launch matters in the AI coding wars with OpenAI

    The Skills announcement arrives during a pivotal moment in Anthropic's competition with OpenAI, particularly around AI-assisted software development. Just one day before releasing Skills, Anthropic launched Claude Haiku 4.5, a smaller and cheaper model that nonetheless matches the coding performance of Claude Sonnet 4 — which was state-of-the-art when released just five months ago.

    That rapid improvement curve reflects the breakneck pace of AI development, where today's frontier capabilities become tomorrow's commodity offerings. OpenAI has been pushing hard on coding tools as well, recently upgrading its Codex platform with GPT-5 and expanding GitHub Copilot's capabilities.

    Anthropic's revenue trajectory — potentially reaching $26 billion in 2026 from an estimated $9 billion by year-end 2025 — suggests the company is successfully converting enterprise interest into paying customers. The timing also follows Salesforce's announcement this week that it's deepening AI partnerships with both OpenAI and Anthropic to power its Agentforce platform, signaling that enterprises are adopting a multi-vendor approach rather than standardizing on a single provider.

    Skills addresses a real pain point: the "prompt engineering" problem where effective AI usage depends on individual employees crafting elaborate instructions for routine tasks, with no way to share that expertise across teams. Skills transforms implicit knowledge into explicit, shareable assets. For startups and developers, the feature could accelerate product development significantly — adding sophisticated document generation capabilities that previously required dedicated engineering teams and weeks of development.

    The composability aspect hints at a future where organizations build libraries of specialized skills that can be mixed and matched for increasingly complex workflows. A pharmaceutical company might develop skills for regulatory compliance, clinical trial analysis, molecular modeling, and patient data privacy that work together seamlessly — creating a customized AI assistant with deep domain expertise across multiple specialties.

    Anthropic indicates it's working on simplified skill creation workflows and enterprise-wide deployment capabilities to make it easier for organizations to distribute skills across large teams. As the feature rolls out to Anthropic's more than 300,000 business customers, the true test will be whether organizations find Skills substantively more useful than existing customization approaches.

    For now, Skills offers Anthropic's clearest articulation yet of its vision for AI agents: not generalists that try to do everything reasonably well, but intelligent systems that know when to access specialized expertise and can coordinate multiple domains of knowledge to accomplish complex tasks. If that vision catches on, the question won't be whether your company uses AI — it will be whether your AI knows how your company actually works.