Agent autonomy without guardrails is an SRE nightmare
João Freitas is GM and VP of engineering for AI and automation at PagerDuty
As AI use continues to evolve in large organizations, leaders are increasingly seeking the next development that will yield major ROI. The latest wave of this trend is the adoption of AI agents. As with any new technology, however, organizations must adopt AI agents responsibly, in a way that delivers both speed and security.
More than half of organizations have already deployed AI agents to some extent, and more expect to follow suit in the next two years. But many early adopters are now reevaluating their approach. Four in 10 tech leaders regret not establishing a stronger governance foundation from the start, which suggests they adopted AI rapidly but left room to improve the policies, rules and best practices that ensure the responsible, ethical and legal development and use of AI.
As AI adoption accelerates, organizations must find the right balance between their exposure risk and the implementation of guardrails to ensure AI use is secure.
Where do AI agents create potential risks?
There are three principal areas of consideration for safer AI adoption.
The first is shadow AI: employees using unauthorized AI tools without express permission, bypassing approved tools and processes. Shadow AI has existed as long as AI tools themselves, but AI agent autonomy makes it easier for unsanctioned tools to operate outside the purview of IT, which can introduce fresh security risks. IT should therefore create sanctioned processes for experimentation and innovation, so that employees have approved paths to more efficient ways of working with AI.
Secondly, organizations must close gaps in AI ownership and accountability to prepare for incidents or processes gone wrong. The strength of AI agents lies in their autonomy. However, if agents act in unexpected ways, teams must be able to determine who is responsible for addressing any issues.
The third risk arises when there is a lack of explainability for actions AI agents have taken. AI agents are goal-oriented, but how they accomplish their goals can be unclear. AI agents must have explainable logic underlying their actions so that engineers can trace and, if needed, roll back actions that may cause issues with existing systems.
None of these risks should delay adoption, but addressing them will help organizations better ensure their security.
The three guidelines for responsible AI agent adoption
Once organizations have identified the risks AI agents can pose, they must implement guidelines and guardrails to ensure safe usage. By following these three steps, organizations can minimize these risks.
1: Make human oversight the default
Agentic AI continues to evolve at a fast pace. However, we still need human oversight when AI agents are given the capacity to act, make decisions and pursue goals that may impact key systems. A human should be in the loop by default, especially for business-critical use cases and systems. The teams that use AI must understand the actions it may take and where they may need to intervene. Start conservatively and, over time, increase the level of agency given to AI agents.
In conjunction, operations teams, engineers and security professionals must understand the role they play in supervising AI agents’ workflows. Each agent should be assigned a specific human owner for clearly defined oversight and accountability. Organizations must also allow any human to flag or override an AI agent’s behavior when an action has a negative outcome.
When considering tasks for AI agents, organizations should understand that, while traditional automation is good at handling repetitive, rule-based processes with structured data inputs, AI agents can handle much more complex tasks and adapt to new information in a more autonomous way. This makes them an appealing solution for all sorts of tasks. But as AI agents are deployed, organizations should control what actions the agents can take, particularly in the early stages of a project. Thus, teams working with AI agents should have approval paths in place for high-impact actions to ensure agent scope does not extend beyond expected use cases, minimizing risk to the wider system.
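As a concrete illustration, an approval path can be as simple as a gate that refuses to run high-impact actions without a named human sign-off. The sketch below is hypothetical Python; the action names and risk tiers are invented for illustration, not drawn from any specific platform.

```python
# Illustrative approval gate for agent actions. High-impact actions stay
# pending until a named human approves them; low-impact actions run directly.

HIGH_IMPACT = {"restart_service", "scale_down", "delete_resource"}

def execute(action, params, approved_by=None):
    """Run an agent action, requiring human sign-off for high-impact ones."""
    if action in HIGH_IMPACT and approved_by is None:
        return {"status": "pending_approval", "action": action}
    # ... perform the action against the real system here ...
    return {"status": "executed", "action": action, "approved_by": approved_by}

print(execute("restart_service", {"service": "checkout"}))
print(execute("restart_service", {"service": "checkout"}, approved_by="sre-oncall"))
```

Starting with a broad `HIGH_IMPACT` set and shrinking it as trust grows mirrors the "start conservatively" guidance above.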
2: Bake in security
The introduction of new tools should not expose a system to fresh security risks.
Organizations should consider agentic platforms that comply with high security standards and are validated by enterprise-grade certifications such as SOC2, FedRAMP or equivalent. Further, AI agents should not be allowed free rein across an organization’s systems. At a minimum, the permissions and security scope of an AI agent must be aligned with the scope of the owner, and any tools added to the agent should not allow for extended permissions. Limiting AI agent access to a system based on their role will also ensure deployment runs smoothly. Keeping complete logs of every action taken by an AI agent can also help engineers understand what happened in the event of an incident and trace back the problem.
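One way to express that least-privilege rule in code is to clamp an agent's permission set to the intersection with its owner's, dropping anything beyond the owner's scope. The permission names below are made up for illustration.

```python
# Illustrative check that an agent's permissions never exceed its human
# owner's. Anything outside the owner's scope is dropped and reported.

def validate_agent_scope(agent_perms, owner_perms):
    """Return the permissions the agent may actually use: the intersection
    with its owner's scope."""
    excess = agent_perms - owner_perms
    if excess:
        print(f"dropping out-of-scope permissions: {sorted(excess)}")
    return agent_perms & owner_perms

owner = {"read:metrics", "write:tickets"}
agent = {"read:metrics", "write:tickets", "deploy:prod"}
print(validate_agent_scope(agent, owner))  # deploy:prod is dropped
```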
3: Make outputs explainable
AI use in an organization must never be a black box. The reasoning behind any action must be surfaced so that any engineer reviewing it can understand the context the agent used for decision-making and access the traces that led to that action.
Inputs and outputs for every action should be logged and accessible. This will help organizations establish a firm overview of the logic underlying an AI agent’s actions, providing significant value in the event anything goes wrong.
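A minimal sketch of such a trace, assuming a simple in-memory list; a production system would write each record to durable, append-only storage instead.

```python
# Sketch of an append-only action trace: inputs and outputs logged per
# action so engineers can reconstruct (and potentially roll back) what
# an agent did. Field names are illustrative.
import json
import time

TRACE = []

def record_action(agent_id, action, inputs, outputs):
    entry = {
        "ts": time.time(),
        "agent": agent_id,
        "action": action,
        "inputs": inputs,
        "outputs": outputs,
    }
    TRACE.append(entry)
    return entry

record_action("triage-bot", "classify_alert",
              {"alert_id": "a-123"}, {"severity": "low"})
print(json.dumps(TRACE[-1], indent=2))
```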
Security underpins AI agents’ success
AI agents offer a huge opportunity for organizations to accelerate and improve their existing processes. However, if they do not prioritize security and strong governance, they could expose themselves to new risks.
As AI agents become more common, organizations must ensure they have systems in place to measure how they perform and the ability to take action when they create problems.
Read more from our guest writers. Or, consider submitting a post of your own! See our guidelines here.
-
Hiring specialists made sense before AI — now generalists win
Tony Stoyanov is CTO and co-founder of EliseAI
In the 2010s, tech companies chased staff-level specialists: backend engineers, data scientists, system architects. That model worked when technology evolved slowly. Specialists knew their craft, could deliver quickly and built careers on predictable foundations like cloud infrastructure or the latest JS framework.
Then AI went mainstream.
The pace of change has exploded. New technologies appear and mature in less than a year. You can’t hire someone who has been building AI agents for five years, as the technology hasn’t existed for that long. The people thriving today aren’t those with the longest résumés; they’re the ones who learn fast, adapt fast and act without waiting for direction. Nowhere is this transformation more evident than in software engineering, which has likely experienced the most dramatic shift of all, evolving faster than almost any other field of work.
How AI is rewriting the rules
AI has lowered the barrier to doing complex technical work, and it has also raised expectations for what counts as real expertise. McKinsey estimates that by 2030, up to 30% of U.S. work hours could be automated and 12 million workers may need to shift roles entirely. Technical depth still matters, but AI favors people who can figure things out as they go.
At my company, I see this every day. Engineers who never touched front-end code are now building UIs, while front-end developers are moving into back-end work. The technology keeps getting easier to use but the problems are harder because they span more disciplines.
In that kind of environment, being great at one thing isn’t enough. What matters is the ability to bridge engineering, product and operations to make good decisions quickly, even with imperfect information.
Despite all the excitement, only 1% of companies consider themselves truly mature in how they use AI. Many still rely on structures built for a slower era — layers of approval, rigid roles and an overreliance on specialists who can’t move outside their lane.
The traits of a strong generalist
A strong generalist has breadth without losing depth. They go deep in one or two domains but stay fluent across many. As David Epstein puts it in Range, “You have people walking around with all the knowledge of humanity on their phone, but they have no idea how to integrate it. We don’t train people in thinking or reasoning.” True expertise comes from connecting the dots, not just collecting information.
The best generalists share these traits:
- Ownership: End-to-end accountability for outcomes, not just tasks.
- First-principles thinking: Question assumptions, focus on the goal, and rebuild when needed.
- Adaptability: Learn new domains quickly and move between them smoothly.
- Agency: Act without waiting for approval and adjust as new information comes in.
- Soft skills: Communicate clearly, align teams and keep customers’ needs in focus.
- Range: Solve different kinds of problems and draw lessons across contexts.
I try to make accountability a priority for my teams. Everyone knows what they own, what success looks like and how it connects to the mission. Perfection isn’t the goal; forward movement is.
Embracing the shift
Focusing on adaptable builders changed everything. These are the people with the range and curiosity to use AI tools to learn quickly and execute confidently.
If you’re a builder who thrives in ambiguity, this is your time. The AI era rewards curiosity and initiative more than credentials. If you’re hiring, look ahead. The people who’ll move your company forward might not be the ones with the perfect résumé for the job. They’re the ones who can grow into what the company will need as it evolves.
The future belongs to generalists and to the companies that trust them.
-
Anthropic launches enterprise ‘Agent Skills’ and opens the standard, challenging OpenAI in workplace AI
Anthropic said on Wednesday it would release its Agent Skills technology as an open standard, a strategic bet that sharing its approach to making AI assistants more capable will cement the company's position in the fast-evolving enterprise software market.
The San Francisco-based artificial intelligence company also unveiled organization-wide management tools for enterprise customers and a directory of partner-built skills from companies including Atlassian, Figma, Canva, Stripe, Notion, and Zapier.
The moves mark a significant expansion of a technology Anthropic first introduced in October, transforming what began as a niche developer feature into infrastructure that now appears poised to become an industry standard.
"We're launching Agent Skills as an independent open standard with a specification and reference SDK available at https://agentskills.io," Mahesh Murag, a product manager at Anthropic, said in an interview with VentureBeat. "Microsoft has already adopted Agent Skills within VS Code and GitHub; so have popular coding agents like Cursor, Goose, Amp, OpenCode, and more. We're in active conversations with others across the ecosystem."
Inside the technology that teaches AI assistants to do specialized work
Skills are, at their core, folders containing instructions, scripts, and resources that tell AI systems how to perform specific tasks consistently. Rather than requiring users to craft elaborate prompts each time they want an AI assistant to complete a specialized task, skills package that procedural knowledge into reusable modules.
The concept addresses a fundamental limitation of large language models: while they possess broad general knowledge, they often lack the specific procedural expertise needed for specialized professional work. A skill for creating PowerPoint presentations, for instance, might include preferred formatting conventions, slide structure guidelines, and quality standards — information the AI loads only when working on presentations.
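For illustration, such a presentation skill might live in a folder whose `SKILL.md` pairs a short machine-readable header with the full instructions. The file below is invented for this example (including the `assets/theme.md` reference), not an official Anthropic skill:

```markdown
---
name: powerpoint-deck
description: Build slide decks that follow our house formatting conventions.
---

# Creating presentations

1. Open with the title/subtitle layout; put the date in the footer.
2. Keep body slides to at most five bullets each.
3. Apply the brand color palette defined in assets/theme.md.
```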
Anthropic designed the system around what it calls "progressive disclosure." Each skill takes only a few dozen tokens when summarized in the AI's context window, with full details loading only when the task requires them. This architectural choice allows organizations to deploy extensive skill libraries without overwhelming the AI's working memory.
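The mechanism behind progressive disclosure can be sketched in a few lines of Python; the skill names, summaries, and the toy relevance check below are all invented for illustration.

```python
# Sketch of "progressive disclosure": only short skill summaries sit in
# the context by default; full instructions load when a task matches.

SKILLS = {
    "powerpoint-deck": {
        "summary": "Build slide decks that follow house formatting rules.",
        "body_path": "skills/powerpoint-deck/SKILL.md",
    },
    "quarterly-report": {
        "summary": "Draft quarterly reports from finance templates.",
        "body_path": "skills/quarterly-report/SKILL.md",
    },
}

def build_context(task):
    """Always include the cheap one-line summaries; inline a full skill
    body only when its name appears relevant to the task."""
    lines = [f"{name}: {meta['summary']}" for name, meta in SKILLS.items()]
    for name, meta in SKILLS.items():
        if name.split("-")[0] in task.lower():  # toy relevance check
            lines.append(f"[full instructions loaded from {meta['body_path']}]")
    return "\n".join(lines)

print(build_context("make a powerpoint for the board meeting"))
```

The summaries cost a few dozen tokens regardless of how large the skill library grows; only the matched skill's full body is ever paid for.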
Fortune 500 companies are already using skills in legal, finance, and accounting
The new enterprise management features allow administrators on Anthropic's Team and Enterprise plans to provision skills centrally, controlling which workflows are available across their organizations while letting individual employees customize their experience.
"Enterprise customers are using skills in production across both coding workflows and business functions like legal, finance, accounting, and data science," Murag said. "The feedback has been positive because skills let them personalize Claude to how they actually work and get to high-quality output faster."
The community response has exceeded expectations, according to Murag: "Our skills repository already crossed 20k stars on GitHub, with tens of thousands of community-created and shared skills."
Atlassian, Figma, Stripe, and Zapier join Anthropic's skills directory at launch
Anthropic is launching with skills from ten partners, a roster that reads like a who's who of modern enterprise software. The presence of Atlassian, which makes Jira and Confluence, alongside design tools Figma and Canva, payment infrastructure company Stripe, and automation platform Zapier suggests Anthropic is positioning Skills as connective tissue between Claude and the applications businesses already use.
The business arrangements with these partners focus on ecosystem development rather than immediate revenue generation.
"Partners who build skills for the directory do so to enhance how Claude works with their platforms. It's a mutually beneficial ecosystem relationship similar to MCP connector partnerships," Murag explained. "There are no revenue-sharing arrangements at this time."
For vetting new partners, Anthropic is taking a measured approach. "We began with established partners and are developing more formal criteria as we expand," Murag said. "We want to create a valuable supply of skills for enterprises while helping partner products shine."
Notably, Anthropic is not charging extra for the capability. "Skills work across all Claude surfaces: Claude.ai, Claude Code, the Claude Agent SDK, and the API. They're included in Max, Pro, Team, and Enterprise plans at no additional cost. API usage follows standard API pricing," Murag said.
Why Anthropic is giving away its competitive advantage to OpenAI and Google
The decision to release Skills as an open standard is a calculated strategic choice. By making skills portable across AI platforms, Anthropic is betting that ecosystem growth will benefit the company more than proprietary lock-in would.
The strategy appears to be working. OpenAI has quietly adopted structurally identical architecture in both ChatGPT and its Codex CLI tool. Developer Elias Judin discovered the implementation earlier this month, finding directories containing skill files that mirror Anthropic's specification—the same file naming conventions, the same metadata format, the same directory organization.
This convergence suggests the industry has found a common answer to a vexing question: how do you make AI assistants consistently good at specialized work without expensive model fine-tuning?
The timing aligns with broader standardization efforts in the AI industry. Anthropic donated its Model Context Protocol to the Linux Foundation on December 9, and both Anthropic and OpenAI co-founded the Agentic AI Foundation alongside Block. Google, Microsoft, and Amazon Web Services joined as members. The foundation will steward multiple open specifications, and Skills fit naturally into this standardization push.
"We've also seen how complementary skills and MCP servers are," Murag noted. "MCP provides secure connectivity to external software and data, while skills provide the procedural knowledge for using those tools effectively. Partners who've invested in strong MCP integrations were a natural starting point."
The AI industry abandons specialized agents in favor of one assistant that learns everything
The Skills approach is a philosophical shift in how the AI industry thinks about making AI assistants more capable. The traditional approach involved building specialized agents for different use cases — a customer service agent, a coding agent, a research agent. Skills suggest a different model: one general-purpose agent equipped with a library of specialized capabilities.
"We used to think agents in different domains will look very different," Barry Zhang, an Anthropic researcher, said at an industry conference last month, according to a Business Insider report. "The agent underneath is actually more universal than we thought."
This insight has significant implications for enterprise software development. Rather than building and maintaining multiple specialized AI systems, organizations can invest in creating and curating skills that encode their institutional knowledge and best practices.
Anthropic's own internal research supports this approach. A study the company published in early December found that its engineers used Claude in 60% of their work, achieving a 50% self-reported productivity boost—a two to threefold increase from the prior year. Notably, 27% of Claude-assisted work consisted of tasks that would not have been done otherwise, including building internal tools, creating documentation, and addressing what employees called "papercuts" — small quality-of-life improvements that had been perpetually deprioritized.
Security risks and skill atrophy emerge as concerns for enterprise AI deployments
The Skills framework is not without potential complications. As AI systems become more capable through skills, questions arise about maintaining human expertise. Anthropic's internal research found that while skills enabled engineers to work across more domains—backend developers building user interfaces, researchers creating data visualizations—some employees worried about skill atrophy.
"When producing output is so easy and fast, it gets harder and harder to actually take the time to learn something," one Anthropic engineer said in the company's internal survey.
There are also security considerations. Skills provide Claude with new capabilities through instructions and code, which means malicious skills could theoretically introduce vulnerabilities. Anthropic recommends installing skills only from trusted sources and thoroughly auditing those from less-trusted origins.
The open standard approach introduces governance questions as well. While Anthropic has published the specification and launched a reference SDK, the long-term stewardship of the standard remains undefined. Whether it will fall under the Agentic AI Foundation or require its own governance structure is an open question.
Anthropic's real product may not be Claude—it may be the infrastructure everyone else builds on
The trajectory of Skills reveals something important about Anthropic's ambitions. Two months ago, the company introduced a feature that looked like a developer tool. Today, that feature has become a specification that Microsoft builds into VS Code, that OpenAI replicates in ChatGPT, and that enterprise software giants race to support.
The pattern echoes strategies that have reshaped the technology industry before. Companies from Red Hat to Google have discovered that open standards can be more valuable than proprietary technology — that the company defining how an industry works often captures more value than the company trying to own it outright.
For enterprise technology leaders evaluating AI investments, the message is straightforward: skills are becoming infrastructure. The expertise organizations encode into skills today will determine how effectively their AI assistants perform tomorrow, regardless of which model powers them.
The competitive battles between Anthropic, OpenAI, and Google will continue. But on the question of how to make AI assistants reliably good at specialized work, the industry has quietly converged on an answer — and it came from the company that gave it away.
-
Palona goes vertical, launching Vision, Workflow features: 4 key lessons for AI builders
Building an enterprise AI company on a "foundation of shifting sand" is the central challenge for founders today, according to the leadership at Palona AI.
Today, the Palo Alto-based startup—led by former Google and Meta engineering veterans—is making a decisive vertical push into the restaurant and hospitality space with the launch of Palona Vision and Palona Workflow.
The new offerings transform the company’s multimodal agent suite into a real-time operating system for restaurant operations — spanning cameras, calls, conversations, and coordinated task execution.
The news marks a strategic pivot from the company’s debut in early 2025, when it first emerged with $10 million in seed funding to build emotionally intelligent sales agents for broad direct-to-consumer enterprises.
Now, by narrowing its focus to a "multimodal native" approach for restaurants, Palona is providing a blueprint for AI builders on how to move beyond "thin wrappers" to build deep systems that solve high-stakes physical world problems.
“You’re building a company on top of a foundation that is sand—not quicksand, but shifting sand,” said co-founder and CTO Tim Howes, referring to the instability of today’s LLM ecosystem. “So we built an orchestration layer that lets us swap models on performance, fluency, and cost.”
VentureBeat spoke with Howes and co-founder and CEO Maria Zhang in person recently at — where else? — a restaurant in NYC about the technical challenges and hard lessons learned from their launch, growth, and pivot.
The New Offering: Vision and Workflow as a ‘Digital GM’
For the end user—the restaurant owner or operator—Palona’s latest release is designed to function as an automated "best operations manager" that never sleeps.
Palona Vision uses in-store security cameras to analyze operational signals without requiring any new hardware. It monitors front-of-house metrics like queue lengths, table turns, and cleanliness, while simultaneously identifying back-of-house issues like prep slowdowns or station setup errors.
Palona Workflow complements this by automating multi-step operational processes. This includes managing catering orders, opening and closing checklists, and food prep fulfillment. By correlating video signals from Vision with Point-of-Sale (POS) data and staffing levels, Workflow ensures consistent execution across multiple locations.
“Palona Vision is like giving every location a digital GM,” said Shaz Khan, founder of Tono Pizzeria + Cheesesteaks, in a press release provided to VentureBeat. “It flags issues before they escalate and saves me hours every week.”
Going Vertical: Lessons in Domain Expertise
Palona’s journey began with a star-studded roster. CEO Zhang previously served as VP of Engineering at Google and CTO of Tinder, while Co-founder Howes is the co-inventor of LDAP and a former Netscape CTO.
Despite this pedigree, the team’s first year was a lesson in the necessity of focus.
Initially, Palona served fashion and electronics brands, creating "wizard" and "surfer dude" personalities to handle sales. However, the team quickly realized that the restaurant industry presented a unique, trillion-dollar opportunity that was "surprisingly recession-proof" but "gobsmacked" by operational inefficiency.
"Advice to startup founders: don't go multi-industry," Zhang warned.
By verticalizing, Palona moved from being a "thin" chat layer to building a "multi-sensory information pipeline" that processes vision, voice, and text in tandem.
That clarity of focus opened access to proprietary training data (like prep playbooks and call transcripts) while avoiding generic data scraping.
1. Building on ‘Shifting Sand’
To accommodate the reality of enterprise AI deployments in 2025 — with new, improved models coming out on a nearly weekly basis — Palona developed a patent-pending orchestration layer.
Rather than being "bundled" with a single provider like OpenAI or Google, Palona’s architecture allows them to swap models on a dime based on performance and cost.
They use a mix of proprietary and open-source models, including Gemini for computer vision benchmarks and specific language models for Spanish or Chinese fluency.
For builders, the message is clear: Never let your product's core value be a single-vendor dependency.
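A minimal sketch of such a routing layer, with made-up model names, quality scores and prices: the point is that swapping vendors becomes a data change, not a code change.

```python
# Toy orchestration layer: route each request to the cheapest model entry
# that speaks the required language and clears the quality bar.
# All model names, scores and prices here are invented.

MODELS = [
    {"name": "model-a", "quality": 0.92, "cost_per_1k": 0.015, "langs": {"en"}},
    {"name": "model-b", "quality": 0.85, "cost_per_1k": 0.004, "langs": {"en", "es", "zh"}},
    {"name": "model-c", "quality": 0.78, "cost_per_1k": 0.001, "langs": {"en", "es"}},
]

def route(lang, min_quality):
    """Cheapest model that speaks the language and meets the quality bar;
    None if nothing qualifies."""
    candidates = [m for m in MODELS
                  if lang in m["langs"] and m["quality"] >= min_quality]
    if not candidates:
        return None
    return min(candidates, key=lambda m: m["cost_per_1k"])["name"]

print(route("es", 0.80))  # model-b: the only Spanish model above the bar
print(route("en", 0.90))  # model-a
```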
2. From Words to ‘World Models’
The launch of Palona Vision represents a shift from understanding words to understanding the physical reality of a kitchen.
While many developers struggle to stitch separate APIs together, Palona’s new vision model transforms existing in-store cameras into operational assistants.
The system identifies "cause and effect" in real-time—recognizing if a pizza is undercooked by its "pale beige" color or alerting a manager if a display case is empty.
"In words, physics don't matter," Zhang explained. "But in reality, I drop the phone, it always goes down… we want to really figure out what's going on in this world of restaurants".
3. The ‘Muffin’ Solution: Custom Memory Architecture
One of the most significant technical hurdles Palona faced was memory management. In a restaurant context, memory is the difference between a frustrating interaction and a "magical" one where the agent remembers a diner’s "usual" order.
The team initially utilized an unspecified open-source tool, but found it produced errors 30% of the time. "I'd advise developers to always turn off memory [on consumer AI products], because that will guarantee to mess everything up," Zhang cautioned.
To solve this, Palona built Muffin, a proprietary memory management system named as a nod to web "cookies". Unlike standard vector-based approaches that struggle with structured data, Muffin is architected to handle four distinct layers:
- Structured data: Stable facts like delivery addresses or allergy information.
- Slow-changing dimensions: Loyalty preferences and favorite items.
- Transient and seasonal memories: Adapting to shifts like preferring cold drinks in July versus hot cocoa in winter.
- Regional context: Defaults like time zones or language preferences.
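A rough sketch of what a layered store along these lines could look like in Python. The layer names follow the list above, but the fields, lookup order, and TTL policy are invented; this is not Palona's implementation.

```python
# Sketch of a four-layer memory store: stable facts are checked first,
# and transient entries expire on their own.
import time

class LayeredMemory:
    def __init__(self):
        self.structured = {}   # stable facts: address, allergies
        self.slow = {}         # loyalty preferences, favorite items
        self.transient = {}    # seasonal signals, with expiry timestamps
        self.regional = {}     # time zone, language defaults

    def remember_transient(self, key, value, ttl_seconds):
        self.transient[key] = (value, time.time() + ttl_seconds)

    def recall(self, key):
        """Check layers from most to least stable; expire stale transients."""
        for layer in (self.structured, self.slow, self.regional):
            if key in layer:
                return layer[key]
        if key in self.transient:
            value, expires = self.transient[key]
            if time.time() < expires:
                return value
            del self.transient[key]  # drop expired seasonal memory
        return None

mem = LayeredMemory()
mem.structured["allergy"] = "peanuts"
mem.remember_transient("drink_pref", "iced tea", ttl_seconds=3600)
print(mem.recall("allergy"), mem.recall("drink_pref"))
```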
The lesson for builders: If the best available tool isn't good enough for your specific vertical, you must be willing to build your own.
4. Reliability through ‘GRACE’
In a kitchen, an AI error isn't just a typo; it’s a wasted order or a safety risk. A recent incident at Stefanina’s Pizzeria in Missouri, where an AI hallucinated fake deals during a dinner rush, highlights how quickly brand trust can evaporate when safeguards are absent.
To prevent such chaos, Palona’s engineers follow its internal GRACE framework:
- Guardrails: Hard limits on agent behavior to prevent unapproved promotions.
- Red teaming: Proactive attempts to "break" the AI and identify potential hallucination triggers.
- App sec: Lock down APIs and third-party integrations with TLS, tokenization, and attack prevention systems.
- Compliance: Grounding every response in verified, vetted menu data to ensure accuracy.
- Escalation: Routing complex interactions to a human manager before a guest receives misinformation.
This reliability is verified through massive simulation. "We simulated a million ways to order pizza," Zhang said, using one AI to act as a customer and another to take the order, measuring accuracy to eliminate hallucinations.
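The shape of that simulate-and-measure loop is easy to sketch. In the toy version below, both "agents" are deterministic stand-ins for LLM calls, and the menu and phrasing are invented; the real system would swap in model calls and a far richer set of utterances.

```python
# Sketch of the simulation loop: a stub "customer" generates orders, a
# stub "order taker" parses them, and we score exact matches.
import random

MENU = ["margherita", "pepperoni", "cheesesteak"]

def customer(rng):
    """Stand-in for a customer-simulating model: emit an order and the
    ground truth (item, quantity) it encodes."""
    item = rng.choice(MENU)
    qty = rng.randint(1, 3)
    return f"I'd like {qty} {item} please", (item, qty)

def order_taker(utterance):
    """Stand-in for the order-taking agent: parse (item, quantity) back out."""
    words = utterance.split()
    return (words[3], int(words[2]))

def run_sim(n=1000, seed=0):
    rng = random.Random(seed)
    correct = sum(
        order_taker(text) == expected
        for text, expected in (customer(rng) for _ in range(n))
    )
    return correct / n

print(f"accuracy: {run_sim():.2%}")
```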
The Bottom Line
With the launch of Vision and Workflow, Palona is betting that the future of enterprise AI isn't in broad assistants, but in specialized "operating systems" that can see, hear, and think within a specific domain.
In contrast to general-purpose AI agents, Palona’s system is designed to execute restaurant workflows, not just respond to queries. It can remember customers, hear them order their "usual," and monitor restaurant operations to ensure the food is delivered according to internal processes and guidelines, flagging whenever something goes wrong or, crucially, is about to.
For Zhang, the goal is to let human operators focus on their craft: "If you've got that delicious food nailed… we’ll tell you what to do."
-