Microsoft lets shopping bots loose in a sandbox 7 Nov 2025, 8:06 pm
Do you think it’s time to turn an AI agent loose to do your procurement for you? As that could be a potentially expensive experiment to conduct in the real world, Microsoft is attempting to determine whether agent-to-agent ecommerce will really work, without the risk of using it in a live environment.
Earlier this week, a team of its researchers launched the Magentic Marketplace, an initiative they described as “an open source simulation environment for exploring the numerous possibilities of agentic markets and their societal implications at scale.” It manages capabilities such as maintaining catalogs of available goods and services, implementing discovery algorithms, facilitating agent-to-agent communication, and handling simulated payments through a centralized transaction layer.
The 23-person research team wrote in a blog detailing the project that it provides “a foundation for studying these markets and guiding them toward outcomes that benefit everyone, which matters because most AI agent research focuses on isolated scenarios — a single agent completing a task or two agents negotiating a simple transaction.”
But real markets, they said, involve a large number of agents simultaneously searching, communicating, and transacting, creating complex dynamics that can’t be understood by studying agents in isolation.
Capturing this complexity is essential, because real-world deployments raise critical questions about consumer welfare, market efficiency, fairness, manipulation resistance, and bias — questions that can’t be safely answered in production environments.
They noted that even state-of-the-art models can show “notable vulnerabilities and biases in marketplace environments,” and that, in the simulations, agents “struggled with too many options, were susceptible to manipulation tactics, and showed systemic biases that created unfair advantages.”
Furthermore, they concluded that a simulation environment is crucial in helping organizations understand the interplay between market components and agents before deploying them at scale.
In their full technical paper, the researchers also detailed significant behavioral variations across agent models, which, they said, included “differential abilities to process noisy search results and varying susceptibility to manipulation tactics, with performance gaps widening as market complexity increases,” adding, “these findings underscore the importance of systematic evaluation in multi-agent economic settings. Proprietary versus open source models work differently.”
Bias and misinformation an issue
Describing Magentic Marketplace as “very interesting research,” Lian Jye Su, chief analyst at Omdia, said that despite recent advancements, foundation models still have many weaknesses, including bias and misinformation.
Thus, he said, “any e-commerce operators that wish to rely on AI agents for tasks such as procurement and recommendations need to ensure the outputs are free of these weaknesses. At the moment, there are a few approaches to achieve this goal. Guardrails and filters will enable AI agents to generate outputs that are targeted and balanced, in line with rules and requirements.”
Many enterprises, said Su, “also apply context engineering to ground AI agents by creating a dynamic system that supplies the right context, such as relevant data, tools, and memory. With these tools in place, an AI agent can be trained to behave more similarly to a human employee and align the organizational interests.”
Similarly, he said, “we can therefore apply the same philosophy to the adoption of AI agents in the enterprise sector in general. AI agents should never be allowed to behave fully autonomously without sufficient check and balance, and in critical cases, human-in-the-loop.”
Thomas Randall, research lead at Info-Tech Research Group, noted, “The key finding was that when agents have clear, structured information (like accurate product data or transparent listings), they make much better decisions.” But the findings, he said, also revealed that these agents can be easily manipulated (for example, by misleading product descriptions or hidden prompts) and that giving agents too many choices can actually make their performance worse.
That means, he said, “the quality of information and the design of the marketplace strongly affect how well these automated systems behave. Ultimately, it’s unclear what massive value-add organizations may get if they let autonomous agents take over buying and selling.”
Agentic buying ‘a broad process’
Jason Anderson, vice president and principal analyst at Moor Insights & Strategy, said the areas the researchers looked into “are well scoped, as there are many different ways to buy and sell things. But, instead of attempting to execute commerce scenarios, the team kept it pretty straightforward to more deeply understand and test agent behavior versus what humans tend to assume naturally.”
For example, he said, “[humans] tend to narrow our selection criteria quickly to two or three options, since it’s tough for people to compare a broad matrix of requirements across many potential solutions, and it turns out that model performance also goes down when there are more choices as well. So, in that way there is some similarity between humans and agents.”
Also, Anderson said, “by testing bias and manipulation, we can see other patterns such as how some models have a bias toward picking the first option that met the user’s needs rather than examining all the options and choosing the best one. These types of observations will invariably end up helping models and agents improve over time.”
He also applauded the fact that Microsoft is open sourcing the data and simulation environment. “There are so many differences in how products and solutions are selected, negotiated, and bought from B2B versus B2C, Premium versus Commodities, cultural differences and the like,” he said. “An open sourcing of this tool will be valuable in terms of how behavior can be tested and shared, all of which will lead to a future where we can trust AI to transact.”
One thing this blog made clear, he noted, “is that agentic buying should be seen as a broad process and not just about executing the transaction; there is discovery, selection, comparison, negotiation, and so forth, and we are already seeing AI and agents being used in the process.”
However, he observed, “I think we have seen more effort from agents on the sell side of the process. For instance, Amazon can help someone discover products with its AI. Salesforce discussed how its Agentforce Sales now enables agents to help customers learn more about an offering. If [they] click on a promotion and begin to ask questions, the agent can then help them through a decision-making process.”
Caution urged
On the buy side, he said, “we are not at the agent stage quite yet, but I am very sure that AI and chatbots are playing a role in commerce already. For instance, I am sure that procurement teams out there are already using chat tools to help winnow down vendors before issuing RFIs or RFPs. And probably using that same tool to write the RFP. On the consumer side, it is very much the same, as comparison shopping is a use case highlighted by agentic browsers like Comet.”
Anderson said that he would also “urge some degree of caution for large procurement organizations to retool just yet. The learnings so far suggest that we still have a lot to learn before we see a reduction of humans in the loop, and if agents were to be used, they would need to be very tightly scoped and a good set of rules between buyer and seller be negotiated, since checking ‘my agent went rogue’ is not on the pick list for returning your order (yet).”
Randall added that for e-commerce operators leaning into this, it is “imperative to present data in consistent, machine-readable formats and be transparent about prices, shipping, and returns. It also means protecting systems from malicious inputs, like text that could trick an AI buyer into making bad decisions — the liabilities in this area are not well-defined, leading to legal headaches and complexities if organizations question what their agent bought.”
Businesses, he said, should expect a future where some customers are bots, and plan policies and protections accordingly, including authentication for legitimate agents and rules to limit abuse.
In addition, said Randall, “many companies do not have the governance in place to move forward with agentic AI. Allowing AI to act autonomously raises new governance challenges: how to ensure accountability, compliance, and safety when decisions are made by machines rather than people — especially if those decisions cannot be effectively tracked.”
Sharing the sandbox
For those who’d like to explore further, Microsoft has made Magentic Marketplace available as an open source environment for exploring agentic market dynamics, with code, datasets, and experiment templates available on GitHub and Azure AI Foundry Labs.
This article originally appeared on Computerworld.
Kong Insomnia 12 bolsters AI, MCP tools 7 Nov 2025, 3:03 pm
Kong has released Insomnia 12, an update to the company’s open-source API development platform. The update is intended to accelerate API and Model Context Protocol (MCP) server development with AI-powered collaboration and testing.
Unveiled November 4 and generally available now, Insomnia 12 enables users to build, test, and deploy faster with native MCP clients, AI mock servers, and AI-powered commit suggestions, Kong said. The latest version of Insomnia addresses challenges in building MCP servers pertaining to how to validate and test what is being built quickly and reliably without complex steps. With Insomnia 12, the platform’s test-iterate-debug workflow is extended to AI-native development, while AI-driven features are introduced that reduce manual overhead and accelerate development, according to Kong.
Key capabilities cited for Insomnia 12 include:
- Native MCP clients enable testing and debugging of MCP servers with the same test-iterate-debug workflow Kong provides for APIs. Users can connect directly to servers, manually invoke any tool, prompt, or resource with custom parameters, and inspect protocol-level and authentication messages and responses.
- AI mock generation allows users to create mock servers by describing their requirements in natural language, providing a URL, JSON sample, or OpenAPI spec.
- AI-powered commits automatically generate descriptive commit messages and logical file groupings by analyzing diffs and history.
- Users can choose between cloud-based large language model (LLM) providers or local LLMs, giving teams control over where code and data reside while balancing performance and privacy needs.
- Insomnia 12 makes it easier for teams to collaborate on API and MCP development. Git Sync provides seamless version control and cross-machine collaboration, while enterprise users can instantly trial advanced security, governance, and compliance features such as SCIM (System for Cross-domain Identity Management), SSO (Single Sign On), and RBAC (Role-based Access Control).
AWS launches ‘Capabilities by Region’ to simplify planning for cloud deployments 7 Nov 2025, 3:53 am
AWS is finally making planning for cloud deployments less of a guessing game for enterprise teams.
The cloud service provider has launched a new planning tool named Capabilities by Region, which provides visibility into the availability of services, tools, and features, including AWS CloudFormation resources, across its global regions.
This visibility is critical for enterprises planning and deploying workloads in the cloud; without it, the consequences can range from spiraling operational expenses and service outages to compliance breaches, analysts say.
“Enterprises struggle with regional service parity. They often discover gaps late in deployment, which causes delays and costly rework. This capability with authoritative, forward-looking visibility into service availability will help address such issues,” Charlie Dai, vice president and principal analyst at Forrester, said.
Dai was referring to the tool’s capability that shows whether a particular service or feature is planned for a region or not.
AWS’s ability to showcase planned service and feature expansions is a key differentiator from rival offerings, said Pareekh Jain, principal analyst with Pareekh Consulting.
While Microsoft Azure’s Product Availability by Region portal lists services by geography, it lacks forward-looking timelines and the unified API comparability that AWS’s Capabilities by Region delivers, Jain noted.
Region Picker, a similar offering from Google, also falls short on granular, future-facing service or API roadmaps, focusing instead on information that helps enterprises optimize for price, carbon footprint, and latency.
According to Jain, Capabilities by Region can help enterprises avoid overspending in the cloud, which is a widespread and growing concern.
Industry estimates show that nearly 30% of most cloud budgets are wasted due to underutilized resources or poor visibility into resources, Jain added.
Unlike AWS CloudFormation and the AWS Cost Explorer, which are other tools related to planning and management of cloud deployments and are accessible via the AWS Management Console, AWS’s Capabilities by Region tool can be accessed via its Builder Center — an AWS products and services-related community portal targeted at cloud architects and developers.
Analysts say the new tool has been deliberately placed outside the AWS Management Console to avoid disruptions to live deployments.
“It is safer for non-admins/partners, and avoids touching live environments. Keeping it outside the Console lowers barriers to entry as no AWS account is needed,” Jain said. AWS is also making all the data inside Capabilities by Region accessible through the AWS Knowledge MCP Server, which will provide developers an avenue to automate cloud planning or get recommendations for cloud deployments via generative AI assistants.
NVIDIA’s GTC 2025: A glimpse into our AI-powered future 7 Nov 2025, 1:03 am
I just got back from NVIDIA’s GTC event in Washington, DC (Oct 26-29), and my head is still spinning. If you’re trying to understand where the tech world is headed, this was the place to be. There weren’t a ton of flashy new product reveals, but what I saw felt even more significant. It was a masterclass in strategy, showing how NVIDIA is cementing itself not just as a chip company, but as the bedrock of the entire modern economy.
The energy was different this year, especially with Jensen Huang making his first keynote appearance at the DC edition. The message was clear: the AI revolution is here, it’s massive, and it’s being built right here in America.
Let me break down what stood out to me.

The staggering scale of it all
The most mind-boggling number tossed out was $500 billion. That’s the cumulative revenue Jensen expects from their Blackwell and next-gen Rubin platforms through next year. Let that sink in for a moment. He mentioned they’re planning on shipping about 20 million of these GPUs by the end of 2026. What really put it in perspective for me was his comment that this level of financial foresight is unheard of in tech. And get this, that colossal figure doesn’t even include the Chinese market due to export controls. This is just demand from “the West.” It’s a clear signal that the hunger for AI compute is nowhere near satisfied.
They’re also sticking to a relentless annual release cadence. While Blackwell Ultra is already out the door, Rubin is on the horizon for next year. The scale is moving from single servers to entire racks, like the Vera Rubin NVL144, which essentially crams 144 powerful GPUs into one cohesive unit. To keep all that data flowing, they also unveiled the BlueField-4 DPU, which is a monster of a data processor. The infrastructure being built today is almost hard to comprehend.

The partnerships defining the next decade
This event was less about a solo act and more about a symphony of alliances. NVIDIA is planting its flag everywhere.
- The US government is all-in: The collaboration with Oracle and HPE to build seven new AI supercomputers for the Department of Energy is a huge deal. The crown jewel is “Solstice,” which will be a 100,000-GPU Blackwell beast. This isn’t just for science; it’s a direct investment in national security and technological sovereignty.
- A $1 billion bet on 6G: The telecom world is next. NVIDIA is investing a cool billion in Nokia to fundamentally reshape cellular networks for the 6G future. They’re working with T-Mobile to start trials next year, which tells me this isn’t just a distant dream — it’s a concrete plan.
- Enterprise AI gets real: I saw how companies like CrowdStrike and Palantir are building deeply integrated AI agents for everything from cybersecurity to managing global supply chains. And in a “wow” moment, Uber announced plans for a 100,000-strong fleet of robotaxis using NVIDIA’s platform, starting in 2027. This stuff is moving from lab to reality, fast.

Invented in America. Built in America
A powerful, recurring theme was the rebirth of American manufacturing. Jensen was very direct, praising the pro-energy and pro-manufacturing focus of the current administration. He argued that to win in tech, we need the energy and the industrial base to support it.
He then laid out a concrete US supply chain:
- Blackwell chips will be fabricated at TSMC’s new plant in Arizona.
- Assembly will happen at a Foxconn facility in Texas.
- The high-bandwidth memory chips will come from a factory in Indiana.
This is a monumental shift from the old model. They’re also using their Omniverse technology to help partners design the hyper-efficient, robotic factories of the future. It feels like a true industrial rebirth, powered by AI.

[Image: a quantum computer. Photo: Raul Leite]
Where my world fits in: The open-source foundation
As someone deeply invested in the open-source ecosystem, I was particularly focused on Red Hat’s role at NVIDIA GTC this year. Seeing our presence there really reinforced something I’ve believed for a long time: All this powerful hardware needs an equally powerful, flexible, and secure software foundation. Our message was simple and reinforced as usual: “We’re better together.” And it couldn’t be truer. There’s a greater than 95% chance that our AI solutions will run on NVIDIA hardware, so our mission is to make that partnership seamless and secure.
As one of the sponsors at GTC DC 2025, Red Hat showcased its commitment to the AI ecosystem and positioned itself as a leader in the modern AI and computing infrastructure space. Our strategy centers on the message “Red Hat & NVIDIA: Better Together,” focusing on complementing NVIDIA’s innovation and accelerating adoption by providing a faster, more reliable path to production.
Our goals for the event were clear: increase awareness of our AI strategy, strengthen relationships with NVIDIA customers and partners, and demonstrate how open source can be the foundation for this new wave of intelligent infrastructure. It’s evident that for the new American industrial base to thrive, it needs a robust, open, and certified software platform. That’s the role we’re committed to playing, ensuring that from government supercomputers to the factory floor, the foundation is solid, secure, and ready for what’s next.

During the GTC week, Red Hat announced two key collaborations with NVIDIA aimed at making AI development simpler and more secure. First, the NVIDIA CUDA Toolkit is now available directly through Red Hat platforms (RHEL, OpenShift, and Red Hat AI), giving developers a single, trusted source for essential GPU tools. Second, Red Hat introduced the STIG-hardened Universal Base Image (UBI-STIG), which NVIDIA is using to build a government-ready GPU Operator, helping agencies accelerate secure AI and ML deployments.
Walking out of the convention center, one thing was unmistakable: We’re no longer just talking about AI, we’re building it. And the scale and speed of what’s happening are unlike anything I’ve seen before. It’s going to be a fascinating few years ahead.
This article is published as part of the Foundry Expert Contributor Network.
More Django developers turning to AI – report 7 Nov 2025, 1:00 am
AI is becoming an important learning resource for users of Django, the well-established Python web framework. The recently published State of Django 2025 report notes that 38% of Django Developers Survey respondents said they were using AI tools to educate themselves on Django.
For Django development, 69% reported using ChatGPT, while 34% said they were using GitHub Copilot, 15% using Anthropic Claude, and 9% using JetBrains AI Assistant. The most popular tasks for developers using AI assistance were autocomplete (56%), generating code (51%), and writing boilerplate code (44%). AI trailed documentation on Djangoproject.com (79%) and Stack Overflow (39%) for learning about Django, though greater rates of AI adoption are anticipated in next year’s survey results, said the report.
Published October 27, the report features insights from more than 4,600 Django developers surveyed worldwide, and was done through a partnership between the Django Software Foundation and JetBrains, makers of the PyCharm Python IDE.
The report also covered JavaScript frameworks that are compatible with Django. HTMX and Alpine.js were cited as the fastest-growing of these, although HTMX works primarily by extending HTML with attributes rather than as a conventional JavaScript framework. HTMX has grown from 5% usage in 2021 to 24%, and Alpine.js from 3% to 14%, according to the report. Meanwhile, React and jQuery, still the two most popular JavaScript frameworks used with Django, have consistently declined from 37% in 2021 to 32% for React and 26% for jQuery in 2025.
Elsewhere in the report:
- Type hints had strong support, with 63% of developers already using type hints and another 17% planning to adopt them, resulting in an 80% overall rate.
- PostgreSQL leads the field in back-end database usage. Of respondents, 76% reported using PostgreSQL, followed by SQLite at 42%, MySQL at 27%, and MariaDB at 9%. These percentages have remained consistent over the past four years.
- Django developers are seasoned; 77% have three or more years of professional experience, with 82% using Django professionally.
- Django REST Framework is the most-popular third-party Django package, used by 49% of respondents.
We can’t ignore cloud governance anymore 7 Nov 2025, 1:00 am
Recent developments in enterprise cloud computing reveal a concerning lack of attention to cloud governance, despite enterprises facing significant risks and potential losses due to outages, inefficiencies, and non-compliance. As enterprises migrate from traditional infrastructures to the cloud, they often do so without a clear strategy to mitigate risk, or they fail to set up an ecosystem that fosters innovation and accountability. That’s why governance has emerged as the most important topic in cloud computing today, and why I, alongside my co-author Meredith Stein, decided to address it in our new book, Unlocking the Power of the Cloud: Governance, Artificial Intelligence, Risk Management, Value.
The book proposes a framework for enterprises to think differently about how they govern their operations in a cloud-forward world. Governance in the cloud is the backbone of any sustainable, scalable, and secure cloud strategy. With decades of experience in cloud computing, artificial intelligence, and risk management, Meredith and I felt this was not only a timely subject but a necessary one. Enterprises are innovating rapidly, but many do so without considering the potential long-term consequences of ungoverned cloud environments. We’ve seen those consequences firsthand through inefficiencies, lost revenue, reputational damage, and even catastrophic outages.
Governance is critical to cloud ecosystems
Cloud computing has fundamentally shifted how businesses operate. Unlike legacy systems where infrastructure and operations were controlled on premises, the cloud introduces new operational complexities. It democratizes access to technologies such as artificial intelligence, machine learning, and advanced data analytics but also brings unprecedented risks. A single minor decision, from misconfigured security protocols to inadequate compliance measures, can trigger a cascade of failures across an enterprise.
Despite these risks, many organizations are still treating cloud governance as an afterthought. Instead, enterprises pour resources into migration and adoption at the expense of creating a governance framework meant to manage risks proactively. This oversight leads to the type of major outages and service disruptions we’ve seen recently, which cost companies millions of dollars and erode brand trust. Events like these aren’t inevitable. With proper governance structures in place, much of the fallout can be mitigated or avoided altogether.
Governance in the cloud is not a constraint—it’s an accelerator. A robust governance structure doesn’t just shield enterprises from risk; it also enables them to innovate without fear of missteps. Enterprises with effective governance can adopt emerging technologies like AI confidently, without exposing themselves to compliance, security, or data management pitfalls. But most organizations have trouble operationalizing this philosophy because they lack a road map. This is where Meredith and I saw an opportunity to make a difference.
Creating business value
In writing Unlocking the Power of the Cloud, one of our primary goals was to frame governance not as a bureaucratic hurdle but as a strategic enabler. Governance should elevate decision-making and risk management to an executive level, fostering alignment between IT, compliance, and business objectives. To do this, we take a business-first approach, emphasizing governance as a tool for achieving broader corporate goals.
The book is laser-focused on how cloud governance intersects with three critical areas: artificial intelligence, risk management, and value creation. We argue that these are the most pivotal aspects of cloud adoption today and that ignoring them leaves enterprises exposed to both external threats and internal disorganization.
First, the integration of artificial intelligence within cloud ecosystems represents both an opportunity and a challenge. AI’s ability to drive insights and automation can unlock massive efficiencies, but when deployed irresponsibly, it can lead to catastrophic failures in decision-making or compliance breaches. Governance ensures that AI initiatives are deployed responsibly, with roles and structures in place to monitor, audit, and refine their implementations.
Second, risk management in the cloud is changing faster than most organizations can keep up. Risks that were irrelevant five years ago, such as cloud-native application security or hybrid cloud architecture vulnerabilities, are now front and center. Enterprises must rethink their approach to risk in the cloud, from redefining acceptable levels of exposure to embedding automated tools that dynamically address vulnerabilities before they evolve into crises. In the book, we cover strategies for incorporating dynamic risk management tools, compliance structures, and a culture of accountability throughout an enterprise’s operations.
Finally, we can’t discuss governance without talking about value creation. Governance is too often viewed as a cost center. In reality, when done well, it’s the opposite. By unlocking operational efficiency, uncovering hidden risks, and providing transparency into resource allocation, governance creates a road map for long-term innovation and profitability. Remember that outages, like those we’ve seen recently, are not merely technology failures; they represent failures of governance. By embedding governance into cloud strategies, organizations can prevent unexpected financial losses and lay the groundwork for scalable success.
Lessons from recent outages
Take, for example, the ripple effects of recent high-profile outages. These failures cost organizations millions of dollars in operational downtime, supply chain disruptions, and reputational harm. In most cases, governance lapses were to blame: mismanaged configurations, the absence of monitoring systems, and inefficiencies in response times. These are not inevitable consequences of cloud computing but direct results of failing to prioritize governance as a central pillar of cloud strategy.
The majority of enterprises are rolling the dice. The belief that cloud computing inherently eliminates risks is a dangerous misconception; without guardrails and policies to control how the cloud operates within an organization, risks can grow unchecked. Enterprises are unknowingly forgoing millions of dollars in potential savings simply because they don’t invest in governance.
The solution to these challenges isn’t abstract or futuristic. It’s attainable today. Cloud governance frameworks are templates for how enterprises can align people, processes, and technologies to minimize risk while maximizing benefits. But understanding their necessity isn’t enough; implementation requires deliberate action, executive sponsorship, and a willingness to overcome initial resistance. The book is a deep dive into all these topics.
Raising the priority of governance
At the heart of this discussion is a critical mindset shift. Organizations must view cloud governance as foundational, not optional. The stakes are too high to get this wrong. A robust governance strategy is essential to survival as enterprises face increasing operational complexities.
Executives, boards, and CIOs must ask themselves a simple question: Can our existing governance strategies weather the next wave of disruption? If the answer isn’t a confident yes, it’s time to act. Done right, governance enables enterprises to scale faster, pivot intelligently, and innovate freely.
This book isn’t intended to sit on a shelf and collect dust. The goal is to spark a broader conversation about prioritizing governance in the age of cloud computing. Modern organizations cannot afford to relegate this conversation to second-tier status. It’s time for enterprises to lead.
AI makes JavaScript programming fun again 7 Nov 2025, 1:00 am
I feel some responsibility to sound a cautionary note amid all the AI fervor, and this report has carried its share of those warnings. But, on the occasion of this November 2025 report, I’d like to instead celebrate AI-driven programming for all it’s worth.
At its best, AI brings back a feeling of excitement and fun to programming. It lifts some of the heavy grunt work off developers, so we can focus on just building things. The thrill of possibility is central to a programmer’s joy, and AI gives us more time to explore possibilities.
There isn’t much AI can do about things like meetings, error logs, and regressions—all the sigh-inducing burdens of the coding life. What it can do is give us more time to explore new tools and improve our coding technique.
In the spirit of building, learning, and changing with the times, here’s the latest in JavaScript goodness.
Top picks for JavaScript readers on InfoWorld
How to vibe code for free, or almost free
What’s more fun than free? Check out these new subscription plans and Chinese open-weight models that deliver high-quality code generation on the cheap.
Intro to Nitro: The server engine built for modern JavaScript
What’s the secret engine powering modern frameworks like Nuxt, SolidStart, and Analog? It’s Nitro. Take some of that time AI assistance saved you and discover something new.
9 vital concepts of modern JavaScript
JavaScript is possibly the single most integral piece of web technology, and it can also be a sprawling behemoth to learn. Cut through the crud with these nine concepts every JavaScript developer should know.
What is vibe coding? AI writes the code so developers can think big
Believe it or not, there’s already something known as “traditional AI coding,” and vibe coding isn’t it. Here’s a quick rundown on the current state and possibilities—and dangers—of AI-driven software development.
More good reads and JavaScript updates elsewhere
Vercel now supports Bun runtime
Vercel’s support for the Bun runtime (in beta) is a bigger deal than you might think. This moves way beyond just using bun install—it means your Next.js apps and server functions can now execute on Bun’s hyper-fast, Zig-built engine. You can also use native calls like Bun.SQL without an adapter.
Bun 1.3 drops
Bun’s development team says version 1.3 is their “biggest release yet.” It solidifies Bun as a batteries-included, full-stack runtime with a native MySQL client (unifying Bun.SQL with Postgres and SQLite), a built-in Redis client, and a full-stack dev server with hot reloading and advanced routing. Believe it or not, there is a ton more in this release.
Making JavaScript web transactions more trustworthy
JavaScript supply-chain attacks have become a thing. A single compromised ad or analytics script can become a “Magecart attack,” stealing user credit cards. This article from Cloudflare describes a new, free tool that automatically blocks attacks and alerts you.
Last chance to participate in the State of JS 2025 survey
As of this writing, the annual developer survey is still accepting responses. There’s still time to add your thoughts about the JavaScript programming experience and tools in 2025.
Malicious npm packages contain Vidar infostealer 6 Nov 2025, 6:27 pm
Malicious code continues to be uploaded to open source repositories, making it a challenge for responsible developers to trust what’s there, and for CISOs to trust applications that include open source code.
The latest example comes from researchers at Datadog Security, who said that last month they found 17 packages (23 releases) in the npm repository that contained downloader malware for Windows systems that executes via a postinstall script.
The associated packages masquerade as Telegram bot helper packages, icon libraries, or legitimate-seeming forks of preexisting projects such as Cursor and React. They provide legitimate functionality, but their actual goal is to execute the Vidar infostealer malware on the victim system. Datadog believes this is the first public disclosure of Vidar malware being delivered via npm packages.
Both of the accounts offering these packages (aartje and saliii229911) have since been banned. However, they were on the registry for about two weeks, and the malicious packages were downloaded at least 2,240 times, though the researchers believe many of those downloads were likely by automated scrapers, with some occurring after the packages had been removed and replaced with empty security holding packages.
All sorts of nasty things
Malicious compromise of open source components can lead to all sorts of nasty things. First, threat actors can steal developers’ credentials and insert backdoors into their code. Second, the malicious code in the downloaded component itself could spread around the world to the developer’s customers.
The Datadog discovery is just another in a long list of malicious code uploaded to npm, PyPI, GitHub, and other open source repositories.
Last week, Koi Security reported finding 126 malicious packages in npm, and in September, researchers at Step Security reported that dozens of npm libraries had been replaced with credential stealing code. The same month, researchers at Aikido reported that 18 highly popular and highly downloaded npm packages had been contaminated.
“I don’t know how to easily solve this problem without requiring a full security view of any newly submitted code, and that’s not fast, cheap, or easy,” commented Roger Grimes, digital defence CISO advisor at KnowBe4.
“But it really is the only answer if you want reliable, safe, open source code.”
Ironically, he said, one of the biggest reasons given for the world to use open source code is that it’s readily reviewable, so anyone can look at it to see and stop vulnerabilities. “But the reality is that almost no one security reviews any of the tens of millions of lines of open source code,” he pointed out.
“There have been dozens of open source projects that attempted to implement more default code review and all have failed,” he said. “One of my favorite related quotes of all time is, ‘Asking for users to review open source code before using is like asking passengers of an airliner to step outside the jet and review it for flight safety before they fly.’ I’m not sure who said that first, but it’s a brilliant summary of why volunteer open source code review really doesn’t work.”
Typosquatting
One favorite tactic of threat actors trying to infect the open source software supply chain is typosquatting, the creation of packages with names similar to those of legitimate ones to trick unwitting developers searching for a particular library. For example, in 2018 a researcher found that threat actors had created phony libraries in the Python repository called ‘diango,’ ‘djago,’ ‘dajngo,’ to dupe developers seeking the popular ‘django’ Python library.
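To make the pattern concrete, here is a minimal TypeScript sketch that flags dependency names sitting within a couple of character edits of well-known packages. The package list, distance threshold, and function names are illustrative only; they are not taken from any registry feed or from the research described above.

```typescript
// Minimal sketch: flag dependency names that sit one or two edits away from
// popular package names, a common typosquatting pattern. The POPULAR list and
// the distance threshold are illustrative assumptions.
const POPULAR = ["django", "react", "express", "lodash", "requests"];

// Classic dynamic-programming Levenshtein (edit) distance.
function editDistance(a: string, b: string): number {
  const dp: number[][] = Array.from({ length: a.length + 1 }, (_, i) =>
    Array.from({ length: b.length + 1 }, (_, j) => (i === 0 ? j : j === 0 ? i : 0)),
  );
  for (let i = 1; i <= a.length; i++) {
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      dp[i][j] = Math.min(dp[i - 1][j] + 1, dp[i][j - 1] + 1, dp[i - 1][j - 1] + cost);
    }
  }
  return dp[a.length][b.length];
}

// Returns the dependency names that look like near-misses of popular packages.
export function flagPossibleTyposquats(dependencies: string[]): string[] {
  return dependencies.filter((dep) =>
    POPULAR.some((popular) => dep !== popular && editDistance(dep, popular) <= 2),
  );
}

// "diango" and "dajngo" are flagged; "django" itself and "left-pad" are not.
console.log(flagPossibleTyposquats(["diango", "dajngo", "django", "left-pad"]));
```

A check like this is only a heuristic; production tooling typically also weighs signals such as download counts, publish dates, and maintainer history.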
CISOs should ensure that employees are educated about the issue of typosquatting and learn what to look for. IT departments should keep a comprehensive inventory of what components are used by all approved software against which audits can be conducted, to ensure only approved components are in place. This inventory and audit should be performed to validate any new components that are introduced.
What more to do?
There’s no shortage of advice for developers and IT and infosec leaders to help them avoid being victimized by malicious packages in open source repositories.
One tactic is to include a software bill of materials in every application an IT department acquires. With it, the DevOps/DevSecOps teams can track software components, identify vulnerabilities, and ensure compliance.
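For instance, a fragment like the one below, assuming a CycloneDX-style JSON SBOM (one common format, with a top-level components array of name/version entries), shows how a team might cross-check an application’s declared components against an approved inventory; the file name and approved list are hypothetical.

```typescript
// Minimal sketch: read a CycloneDX-style SBOM (JSON) and report any component
// not found on an approved inventory list. "sbom.cdx.json" and the approved
// set are hypothetical placeholders.
import { readFileSync } from "node:fs";

interface SbomComponent {
  name: string;
  version?: string;
}

const approved = new Set(["react@18.2.0", "express@4.19.2"]); // hypothetical inventory

const sbom = JSON.parse(readFileSync("sbom.cdx.json", "utf8"));
const components: SbomComponent[] = sbom.components ?? [];

for (const c of components) {
  const id = `${c.name}@${c.version ?? "unknown"}`;
  if (!approved.has(id)) {
    console.warn(`Not in approved inventory: ${id}`);
  }
}
```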
In 2021, the US Cybersecurity and Infrastructure Security Agency (CISA) and the US National Institute of Standards and Technology (NIST) published an advisory, Defending Against Software Supply Chain Attacks, providing advice for creating secure open source apps. It starts with the creation of a formal supply chain risk management program to ensure that supply chain risk receives attention across the organization, even among executives and managers within operations and personnel across supporting roles, such as IT, acquisitions, legal, risk management, and security.
An organization can reduce its software attack surface through configuration management, the advisory says, which includes:
- placing configurations under change control;
- conducting security impact analyses;
- implementing manufacturer-provided guidelines to harden software, operating systems, and firmware;
- maintaining an information system component inventory.
In addition, the Open Source Web Application Security Project (OWASP) offers this advice to developers using npm:
- always vet and perform due diligence on third-party modules that you install to confirm their health and credibility;
- hold off on immediate upgrades to new versions, allowing new package versions some time to circulate before trying them out;
- before upgrading, make sure to review changelogs and release notes for the upgraded version;
- when installing packages, add the --ignore-scripts suffix to disable the execution of any scripts by third-party packages;
- consider adding ignore-scripts to the .npmrc project file, or to the global npm configuration.
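The last two items refer to npm’s ignore-scripts setting (ignore-scripts=true in .npmrc), which blocks the install-time hooks, such as the postinstall script the Vidar-carrying packages relied on. As a complementary spot check, a rough TypeScript sketch like the one below lists which installed packages declare install-time scripts; the directory walk is simplified and illustrative, not a substitute for a real software composition analysis tool.

```typescript
// Rough sketch: list packages in ./node_modules that declare install-time
// lifecycle scripts (preinstall/install/postinstall). Scoped packages
// (@scope/name) are handled one level deep; error handling is omitted.
import { existsSync, readdirSync, readFileSync } from "node:fs";
import { join } from "node:path";

const LIFECYCLE = ["preinstall", "install", "postinstall"];

function checkPackage(dir: string): void {
  const manifest = join(dir, "package.json");
  if (!existsSync(manifest)) return;
  const pkg = JSON.parse(readFileSync(manifest, "utf8"));
  const hooks = LIFECYCLE.filter((s) => pkg.scripts?.[s]);
  if (hooks.length > 0) {
    console.log(`${pkg.name}@${pkg.version} declares: ${hooks.join(", ")}`);
  }
}

const root = "node_modules";
for (const entry of readdirSync(root)) {
  if (entry.startsWith(".")) continue;
  if (entry.startsWith("@")) {
    // Scoped packages live one directory deeper.
    for (const scoped of readdirSync(join(root, entry))) {
      checkPackage(join(root, entry, scoped));
    }
  } else {
    checkPackage(join(root, entry));
  }
}
```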
Finally, Andrew Krug, Datadog’s head of security advocacy, offered these additional tips:
- give developers the ability to install real-time package scanning at installation;
- guard against typosquatting and dependency confusion by prioritizing the use of internal package repositories as a guardrail for approved packages;
- maintain software bills of materials;
- Deploy SCA (software composition analysis) at every phase of the software development lifecycle. Traditional SCA tools only periodically analyze code snapshots, he said, but effective detection must be complemented with real-time visibility into deployed services, including production, to reprioritize issues and focus on those exposed in sensitive environments.
Google’s cheaper, faster TPUs are here, while users of other AI processors face a supply crunch 6 Nov 2025, 12:02 pm
Relief could be on the way for enterprises facing shortages of GPUs to run their AI workloads, or unable to afford the electricity to power them: Google will add Ironwood, a faster, more energy-efficient version of its Tensor Processing Unit (TPU), to its cloud computing offering in the coming weeks.
Analysts expect Ironwood to offer price-performance similar to GPUs from AMD and Nvidia, running in Google’s cloud, so this could ease the pressure on enterprises and vendors struggling to secure GPUs for AI model training or inferencing projects.
That would be particularly welcome as enterprises grapple with a global shortage of high-end GPUs that is driving up costs and slowing AI deployment timelines, and even those who have the GPUs can’t always get the electricity to operate them.
That doesn’t mean it will be all plain sailing for Google and its TPU customers, though: Myron Xie, a research analyst at SemiAnalysis, warned that Google might also face constraints in terms of chip manufacturing capacity at Taiwan Semiconductor Manufacturing Company (TSMC), which is facing bottlenecks around limited capacity for advanced chip packaging.
Designed for TensorFlow
Ironwood is the seventh generation of Google’s TPU platform, and was designed alongside TensorFlow, Google’s open-source machine learning framework.
That gives the chips an edge over general-purpose GPUs for common AI workloads built with TensorFlow, said Omdia principal analyst Alexander Harrowell. Many AI models, especially in research and enterprise scenarios, are built using TensorFlow, he said, and the TPUs are highly optimized for such operations, while general-purpose GPUs that support multiple frameworks aren’t as specialized.
Opportunities for the AI industry
LLM vendors such as OpenAI and Anthropic, which still have relatively young code bases and are continuously evolving them, also have much to gain from the arrival of Ironwood for training their models, said Forrester vice president and principal analyst Charlie Dai.
In fact, Anthropic has already agreed to procure 1 million TPUs for training its models and running inference. Other, smaller vendors using Google’s TPUs for training models include Lightricks and Essential AI.
Google has seen a steady increase in demand for its TPUs (which it also uses to run internal services), and is expected to buy $9.8 billion worth of TPUs from Broadcom this year, compared to $6.2 billion and $2.04 billion in 2024 and 2023 respectively, according to Harrowell.
“This makes them the second-biggest AI chip program for cloud and enterprise data centers, just tailing Nvidia, with approximately 5% of the market. Nvidia owns about 78% of the market,” Harrowell said.
The legacy problem
While some analysts were optimistic about the prospects for TPUs in the enterprise, IDC research director Brandon Hoff said enterprises will most likely stay away from Ironwood, and TPUs in general, because their existing code bases were written for other platforms.
“For enterprise customers who are writing their own inferencing, they will be tied into Nvidia’s software platform,” Hoff said, referring to CUDA, the software platform that runs on Nvidia GPUs. CUDA was released to the public in 2007, while the first version of TensorFlow has only been around since 2015.
This article first appeared on Network World.
Tabnine launches ‘org-native’ AI agent platform 6 Nov 2025, 12:01 pm
Tabnine has launched the Tabnine Agentic Platform for AI-assisted software development with coding agents. The platform enables enterprise software development teams to ship faster while maintaining control over code and context, the company said.
With Tabnine Agentic, introduced November 5, developers get autonomous coding partners that complete workflows, not just code suggestions or completions, all aligned with an organization’s standards and security policies, Tabnine said. Powered by the Tabnine Enterprise Context Engine, Tabnine’s “org-native” agents understand the users’ repositories, tools, and policies and use these artifacts to plan, execute, and validate multi-step development tasks, Tabnine said. Agent tasks include refactoring, debugging, and documentation. The engine incorporates coding standards, source and log files, and ticketing systems. Tabnine agents execute complete coding workflows, offering security and context, according to Tabnine.
Tabnine Agents can use external systems and tools to adapt to new codebases and policies without retraining or redeployment. The engine combines vector, graph, and agentic retrieval techniques to interpret relationships across codebases, tickets, and tools, enabling Tabnine’s org-native agents to reason through multi-step workflows, the company said. Enterprise-grade benefits cited include:
- Agents can automatically adapt to new codebases and policies, with no retraining or redeployment required.
- Agents can act and iterate autonomously through coding workflows.
- Centralized control ensures oversight of permissions, usage, and context.
- Contextual intelligence provides awareness of internal repositories, ticketing systems, and coding guidelines.
- SaaS, private, VPC, on-premises, and air-gapped deployments are all available and meet enterprise security standards.
Perplexity’s open-source tool to run trillion-parameter models without costly upgrades 6 Nov 2025, 4:52 am
Perplexity AI has released an open-source software tool that solves two expensive problems for enterprises running AI systems: being locked into a single cloud provider and the need to buy the latest hardware to run massive models.
The tool, called TransferEngine, enables large language models to communicate across different cloud providers’ hardware at full speed. Companies can now run trillion-parameter models like DeepSeek V3 and Kimi K2 on older H100 and H200 GPU systems instead of waiting for expensive next-generation hardware, Perplexity wrote in a research paper. The company also open-sourced the tool on GitHub.
“Existing implementations are locked to specific Network Interface Controllers, hindering integration into inference engines and portability across hardware providers,” the researchers wrote in their paper.
The vendor lock-in trap
That lock-in stems from a fundamental technical incompatibility, according to the research. Cloud providers use different networking protocols for high-speed GPU communication. Nvidia’s ConnectX chips use one standard, while AWS’s Elastic Fabric Adapter (AWS EFA) uses an entirely different proprietary protocol.
Previous solutions worked on one system or the other, but not both, the paper noted. This forced companies to commit to a single provider’s ecosystem, or accept dramatically slower performance.
The problem is particularly acute with newer Mixture-of-Experts models, Perplexity found. DeepSeek V3 packs 671 billion parameters. Kimi K2 hits a full trillion. These models are too large to fit on single eight-GPU systems, according to the research.
The obvious answer would be Nvidia’s new GB200 systems, essentially one giant 72-GPU server. But those cost millions, face extreme supply shortages, and aren’t available everywhere, the researchers noted. Meanwhile, H100 and H200 systems are plentiful and relatively cheap.
The catch: running large models across multiple older systems has traditionally meant brutal performance penalties. “There are no viable cross-provider solutions for LLM inference,” the research team wrote, noting that existing libraries either lack AWS support entirely or suffer severe performance degradation on Amazon’s hardware.
TransferEngine aims to change that. “TransferEngine enables portable point-to-point communication for modern LLM architectures, avoiding vendor lock-in while complementing collective libraries for cloud-native deployments,” the researchers wrote.
How TransferEngine works
TransferEngine acts as a universal translator for GPU-to-GPU communication, according to the paper. It creates a common interface that works across different networking hardware by identifying the core functionality shared by various systems.
TransferEngine uses RDMA (Remote Direct Memory Access) technology. This allows computers to transfer data directly between graphics cards without involving the main processor—think of it as a dedicated express lane between chips.
Perplexity’s implementation achieved 400 gigabits per second throughput on both Nvidia ConnectX-7 and AWS EFA, matching existing single-platform solutions. TransferEngine also supports using multiple network cards per GPU, aggregating bandwidth for even faster communication.
“We address portability by leveraging the common functionality across heterogeneous RDMA hardware,” the paper explained, noting that the approach works by creating “a reliable abstraction without ordering guarantees” over the underlying protocols.
Already live in production environments
The technology isn’t just theoretical. Perplexity has been using TransferEngine in production to power its AI search engine, according to the company.
The company deployed it across three critical systems. For disaggregated inference, TransferEngine handles the high-speed transfer of cached data between servers, allowing companies to scale their AI services dynamically. The library also powers Perplexity’s reinforcement learning system, achieving weight updates for trillion-parameter models in just 1.3 seconds, the researchers said.
Perhaps most significantly, Perplexity implemented TransferEngine for Mixture-of-Experts routing. These models route different requests to different “experts” within the model, creating far more network traffic than traditional models. DeepSeek built its own DeepEP framework to handle this, but it only worked on Nvidia ConnectX hardware, according to the paper.
TransferEngine matched DeepEP’s performance on ConnectX-7, the researchers said. More importantly, they said it achieved “state-of-the-art latency” on Nvidia hardware while creating “the first viable implementation compatible with AWS EFA.”
In testing DeepSeek V3 and Kimi K2 on AWS H200 instances, Perplexity found substantial performance gains when distributing models across multiple nodes, particularly at medium batch sizes, the sweet spot for production serving.
The open-source bet
Perplexity’s decision to open-source production infrastructure contrasts sharply with competitors like OpenAI and Anthropic, which keep their technical implementations proprietary.
The company released the complete library, including code, Python bindings, and benchmarking tools, under an open license.
The move mirrors Meta’s strategy with PyTorch — open-source a critical tool, help establish an industry standard, and benefit from community contributions. Perplexity said it’s continuing to optimize the technology for AWS, following updates to Amazon’s networking libraries to further reduce latency.
Flaw in React Native CLI opens dev servers to attacks 6 Nov 2025, 4:33 am
A critical remote-code execution (RCE) flaw in the widely used @react-native-community/cli (and its server API) lets attackers run arbitrary OS commands via the Metro development server, the default JavaScript bundler for React Native.
In essence, launching the development server through standard commands (e.g., npm start or npx react-native start) could expose the machine to external attackers, because the server binds to all network interfaces by default (0.0.0.0), rather than limiting itself to “localhost” as the console message claims.
According to JFrog researchers, the bug is a severe issue threatening developers of React Native apps. While exploitation on Windows is well-demonstrated (full OS command execution via an unsafe open() call), the macOS/Linux paths are currently less straightforward, though the risk remains real and subject to further research.
A fix is available, but development teams must move fast, JFrog researchers warned in a blog post.
Weak development server defaults
The vulnerability arises because the Metro development server, which is started by the CLI tool, exposes a “/open-url” HTTP endpoint that takes a URL parameter from a POST request and passes it directly to the “open()” function in the open NPM package. On Windows, this can spawn a “cmd /c..” call, enabling arbitrary command execution.
Adding to the problem is a misconfiguration in the CLI, which prints that the server is listening on “localhost”, but under the hood the host value ends up undefined, and the server listens on 0.0.0.0 by default, opening it to all external networks.
This combination of insecure default binding and the flawed open() call creates the conditions for remote code execution, something rare and dangerous in a development-only tool.
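For illustration only (this is not the Metro or CLI source code), a minimal Node/TypeScript sketch shows the difference between the two bindings at issue; port 8081 is used here simply as a stand-in for the bundler’s port.

```typescript
// Minimal sketch of the binding pitfall: omitting the host (or passing
// undefined) generally exposes the port on all interfaces, while an explicit
// loopback address keeps a dev server reachable only from the local machine.
import { createServer } from "node:http";

const server = createServer((_req, res) => {
  res.end("dev server response\n");
});

// Reachable from other machines on the network:
//   server.listen(8081, "0.0.0.0");
//   server.listen(8081); // host omitted, similar effect on most systems

// Reachable only from this machine (the safer choice for a dev tool):
server.listen(8081, "127.0.0.1", () => {
  console.log("listening on 127.0.0.1:8081 only");
});
```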
“This vulnerability shows that even straightforward Remote Code Execution flaws, such as passing user input to the system shell, are still found in real-world software, especially in cases where the dangerous sink function actually resides in 3rd-party code, which was the imported “open” function in this case,” the researchers said.
The bug, tracked as CVE-2025-11953, is assigned a CVSS score of 9.8 out of 10, and affects versions 4.8.0 through 20.0.0-alpha.2.
What must developers do now?
Developers using @react-native-community/cli (or the bundled cli-server-api) in their React Native projects should check for the vulnerable package version with npm list. The vulnerability is fixed in version 20.0.0 of cli-server-api, so immediate updating is recommended.
The stakes include an attacker remotely executing commands on the victim’s development machine, potentially leading to broader network access, code corruption, or injecting malicious payloads into an app build. If updating isn’t feasible right away, JFrog advised restricting the dev server to localhost by explicitly passing the “--host 127.0.0.1” flag to reduce exposure.
“It’s a reminder that secure coding practices and automated security scanning are essential for preventing these easily exploitable flaws before they make it to production,” the researchers said, recommending JFrog SAST for identifying issues early in the development process.
The React Native CLI flaw mirrors a broader trend of attackers slipping into developer ecosystems, from npm packages with hidden payloads to rogue “verified” IDE extensions, turning trusted build tools into stealthy points of entry.
Google boosts Vertex AI Agent Builder with new observability and deployment tools 6 Nov 2025, 3:44 am
Google Cloud has updated its Vertex AI Agent Builder with new observability dashboards, faster build-and-deploy tools, and stronger governance controls, aiming to make it easier for developers to move AI agents from prototype to production at scale.
The update adds an observability dashboard within the Agent Engine runtime to track token usage, latency, and error rates, along with a new evaluation layer that can simulate user interactions to test agent reliability.
Developers can now deploy agents to production with a single command using the Agent Development Kit (ADK), Google said in a blog post. New governance tools, such as agent identities tied to Cloud IAM and Model Armor, which block prompt injection attacks, are designed to improve security and compliance.
The ADK, which Google says has been downloaded more than seven million times, now supports Go in addition to Python and Java. This broader language support is aimed at making the framework accessible to a wider developer base and improving flexibility for enterprise teams building on multi-language stacks.
Google has also expanded managed services within the Agent Engine runtime. Developers can now deploy to the Agent Engine runtime directly from the ADK command-line interface without creating a full Google Cloud account. A Gmail address is enough to start using the service, with a free 90-day trial available for testing.
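To give a sense of the workflow, here is a minimal sketch of an agent defined with the ADK in Python; the agent name, instruction, and tool are invented for illustration, and the exact module paths and deploy command arguments should be checked against Google’s ADK documentation.

```python
# Minimal sketch of an agent defined with Google's Agent Development Kit (ADK).
# The name, instruction, and tool below are illustrative placeholders; verify
# module paths and parameters against the current ADK documentation.
from google.adk.agents import Agent

def get_order_status(order_id: str) -> dict:
    """Hypothetical tool: look up an order in an internal system."""
    return {"order_id": order_id, "status": "shipped"}

root_agent = Agent(
    name="support_agent",
    model="gemini-2.0-flash",  # any model the runtime supports
    description="Answers order-status questions.",
    instruction="Use the get_order_status tool to answer questions about orders.",
    tools=[get_order_status],
)

# Deploying to the managed Agent Engine runtime is then a single CLI step,
# along the lines of: adk deploy agent_engine <path-to-agent>
```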
Agents built with Vertex AI Agent Builder can also be registered within Gemini Enterprise, giving employees access to custom-built agents in one workspace and linking internal tools with generative AI workflows.
The race to provide developer-friendly tools for creating secure and scalable agentic systems reflects a wider shift in enterprise AI. With the latest updates, Google is strengthening its position against competition that includes Microsoft’s Azure AI Foundry and AWS Bedrock.
Developer productivity gains
The updates are intended to make it easier to build and scale AI agents while enhancing governance and security controls.
“By turning orchestration, environment setup, and runtime management into managed services, Google’s Agent Development Kit cuts down on the time it takes to create and deploy software,” said Dhiraj Badgujar, senior research manager at IDC. “Vertex’s built-in model registry, IAM, and deployment fabric can shorten early development cycles for enterprises who are already using GCP.”
“LangChain and Azure AI Foundry provide for more model/cloud interoperability and manual flexibility, but they need more setup and bespoke integration to reach the same level of scalability, monitoring, and environment parity,” Badgujar added. “For new projects that fit with GCP, ADK may speed up development cycles by 2–3 times.”
Charlie Dai, VP and principal analyst at Forrester, agreed that Google’s new capabilities streamline the development process. “Compared to other offerings that often require custom pipelines and integration steps, Google’s approach can cut iteration time for teams already on Vertex AI,” Dai added.
Tulika Sheel, senior VP at Kadence International, noted that the ADK and one-click deployment in Vertex AI Agent Builder simplify agent creation by reducing setup and integration effort.
“For highly custom or niche workflows, the flexibility of open-framework solutions still wins, but for many enterprises seeking faster time-to-value, Google’s offering could be a real accelerator,” Sheel added.
The upgrade also represents a reset in how enterprises move from prototype to production, according to Sanchit Vir Gogia, chief analyst, founder, and CEO of Greyhound Research.
“For years, teams have been slowed by the hand-offs between development, security, and operations,” Gogia said. “Each phase added new tools, new reviews, and fresh delays. Google has pulled those pieces into one track. A developer can now build, test, and release an agent that already fits inside corporate policy.”
Observability and evaluation features
Analysts view Google’s new observability and evaluation tools as a significant improvement, though they say the capabilities are still developing for large-scale and non-deterministic agent workflows.
“The features in Vertex AI Agent Builder are a solid step forward but remain early-stage for complex, non-deterministic agent debugging,” Dai said. “While they provide granular metrics and traceability, integration with OpenTelemetry or Datadog is possible through custom connectors but not yet native.”
Others agreed that the tools are not yet full-stack mature. The latest updates enable real-time and retrospective debugging with agent-level tracing, tool auditing, and orchestrator visualization, along with evaluation using both metric-based and LLM-based regression testing.
“ADK gives GCP-native agents a lot of visibility, but multi-cloud observability is still not mature,” Badgujar said. “The new features make debugging non-deterministic flows a lot easier, although deep correlation across multi-agent states still needs third-party telemetry.”
Sheel echoed similar thoughts while acknowledging that the features are promising.
“At this stage, they’re still maturing,” Sheel said. “Enterprise uses with complex non-deterministic workflows (multi-agent orchestration, tool chains) will likely require additional monitoring hooks, custom dashboards, and metric extensions.”
Databricks adds customizable evaluation tools to boost AI agent accuracy 6 Nov 2025, 3:28 am
Databricks is expanding the evaluation capabilities of its Agent Bricks interface with three new features that are expected to help enterprises improve the accuracy and reliability of AI agents.
Agent Bricks, released in beta in June, is a generative AI-driven automated interface that streamlines agent development for enterprises and combines technologies developed by MosaicML, including TAO, the synthetic data generation API, and the Mosaic Agent platform.
The new features, which include Agent-as-a-Judge, Tunable Judges, and Judge Builder, enhance Agent Bricks’ automated evaluation system with more flexibility and customization, Craig Wiley, senior director of product management at Databricks, told InfoWorld.
Agent Bricks’ automated evaluation system can generate evaluation benchmarks via an LLM judge based on the defined agent task or workflow, often using synthetic data, to assess agent performance as part of its auto-optimization loop.
However, it didn’t offer an automated ability for developers to dig through the agent’s execution trace to find relevant steps without writing code.
One of the new features, Agent-as-a-Judge, offers that capability for developers, saving time and complexity while offering insights into an agent’s trace that can make evaluations more accurate.
“It’s a new capability that makes those automated evaluations even smarter and more adaptable — adding intelligence that can automatically identify which parts of an agent’s trace to evaluate, removing the need for developers to write or maintain complex traversal logic,” Wiley said.
Derek Ashmore, agentic AI enablement principal at AI and data consultancy Asperitas Consulting, likewise sees Agent-as-a-Judge as a more flexible and explainable way to assess AI agent accuracy than the automated scoring that originally shipped with Agent Bricks.
Tunable Judges for agents with domain expertise
Another feature, Tunable Judges, is designed to give enterprises the flexibility to tune LLM judges for agents with domain expertise, which is a growing requirement in enterprise production environments.
“Enterprises value domain experts’ input to ensure accurate evaluations that reflect unique contexts, business needs, or compliance standards,” said Robert Kramer, principal analyst at Moor Insights & Strategy. “When Agent Bricks was initially introduced, many enterprises welcomed the ability to automate the evaluation and assessment of agents based on quality. As these agents transitioned from prototypes to a more demanding production environment, the limitations of generic evaluation logic became evident,” Kramer added.
Tunable Judges was the result of customer feedback, specifically on capturing subject matter expertise accurately and letting enterprises define what “correctness” is applicable to their agents, Wiley said.
Tunable Judges could be used to ensure that clinical summaries don’t omit contraindications in healthcare, to enforce compliant language in portfolio recommendations, or to evaluate tone, de-escalation accuracy, and policy adherence in customer support.
Enterprises have the option of using the new “make_judge” SDK introduced in MLflow 3.4.0 to create custom LLM judges by defining tailored evaluation criteria in natural language within Python code and running an evaluation on it.
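As a rough illustration of that workflow, the sketch below defines a custom judge with make_judge; the criteria text is invented, and the exact import path, template variables, and optional parameters should be verified against the MLflow 3.4 documentation.

```python
# Sketch of a custom LLM judge built with MLflow's make_judge API (MLflow >= 3.4).
# The criteria below are illustrative; confirm the import path, template fields,
# and optional arguments (such as the judge model) in the MLflow docs.
from mlflow.genai.judges import make_judge

compliance_judge = make_judge(
    name="compliant_language",
    instructions=(
        "Evaluate whether the response in {{ outputs }} uses compliant, "
        "non-promissory language for the portfolio question in {{ inputs }}. "
        "Answer 'yes' or 'no' and explain briefly."
    ),
    # Optionally pass model=... to choose which LLM acts as the judge.
)

# The judge can then be supplied as a scorer when running an evaluation,
# for example via mlflow.genai.evaluate(data=eval_data, scorers=[compliance_judge]).
```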
Easing the complexity of agent evaluation
Enterprises also have the option of using Judge Builder, a new visual interface within the Databricks workspace, to create and tune LLM judges with domain knowledge from subject matter experts and to take advantage of the Agent-as-a-Judge capability.
Judge Builder, according to Kramer, is Databricks’ effort to set itself apart from rivals such as Snowflake, Salesforce, and ServiceNow, which also offer agent evaluation features, by making agent evaluation less complex and more customizable.
“Snowflake’s agent tools use frameworks to check quality, but they don’t let you tune checks with business-specific feedback or domain rules in the same way Databricks does,” Kramer said.
Snowflake already offers AI observability and Cortex Agents, including “LLM-as-a-judge” evaluations, which focus on measuring accuracy and performance rather than interpreting an agent’s full execution trace.
Comparing Databricks’ new agent evaluation tools to those of Salesforce and ServiceNow, Kramer said that both vendors mostly focus on automating workflows and outcomes without deep, tunable agent judgment options. “If you need really tailored compliance or want business experts involved in agent quality, Databricks has the edge. For more basic automations, these differences probably matter less,” Kramer added.
Microsoft steers Aspire to a polyglot future 6 Nov 2025, 1:00 am
Microsoft’s Aspire development framework has dropped .NET from its name and moved to a new website, as it is now becoming a general-purpose environment for building, testing, and deploying scalable cross-cloud applications. Aspire has already proven to be a powerful tool for quickly creating cloud-native C# code. Is it ready to support other pieces of the modern development stack?
I’ve looked at Aspire before as it’s long been one of the more interesting parts of Microsoft’s developer tools, taking a code-first approach to all aspects of development and configuration. Instead of a sprawl of different (often YAML) files to configure services and platforms, Aspire uses a single code-based AppHost file that describes your application and the services it needs to run.
Along with the platform team at Microsoft, the growing Aspire community is developing an expanding set of integrations for what Aspire calls resources: applications, languages, runtimes, and services. There’s a standard format for building integrations that makes it easy to build your own and share them with the rest of the Aspire community, adding hooks for code and OpenTelemetry providers for Aspire’s dashboard.
Making Aspire cross-language
How does Aspire go from a .NET tool to supporting a wider set of platforms? Much of its capability comes from its architecture and its code-based approach to defining the components of your applications.
Using AppHost to bring together your code is the key to building polyglot applications in Azure. It lets you mix and match the code you need: a React front end, a Python data and AI layer, and Go services for business logic. You define how they’re called and how they’re deployed, whether for test or for production, on your PC or in the cloud.
Such an approach builds on familiar tools. There’s no difference between instantiating a custom Go application in a container and doing the same for an application like Redis. The only difference is whether you use a Microsoft-provided integration, one from the growing Aspire community, or one you’ve built yourself.
If you want to use, say, a Node.js component as part of an Aspire application, use the Aspire command line (or Visual Studio) to add the Node.js hosting library to your project. With a prebuilt application using Express or a similar framework, your AppHost simply needs to add a call to Aspire’s builder method using AddNodeApp for an application or AddNpmApp for one that’s packaged for Node’s package manager.
Node.js needs to be installed on your development and production systems, with code providing an appropriate REST API that can be consumed by the rest of your application. If you have other JavaScript code, like a React front end, it can be launched using the same tooling, packaging them all in separate Docker files.
Aspire Community Toolkit
An important piece of Aspire’s polyglot future is the Aspire Community Toolkit. This is a library of tools for hosting code and integrating with services that may not be in the official release yet. It gives you the tools to quickly extend Aspire in the direction you need without having to wait for a full internal review cycle. You get to move faster, albeit with the risks of not being able to use official support resources or of working with features that may not be quite ready for production.
If you use features from the Aspire Community Toolkit in your AppHost, you’re able to start with cutting-edge tools to build applications, like the Bun and Deno JavaScript/TypeScript environments, or you can work with memory-safe Go and Rust code. You can even bring in legacy Java code with support for a local JDK and popular enterprise frameworks like Spring.
There’s a long list of integrations as part of the Aspire Community Toolkit documentation, covering languages and runtimes, multiple container types and containerized services, and additional databases. If you want to use a specific client for a service, the toolkit includes a set of useful tools that can simplify working with APIs, including using popular .NET features like the Entity Framework. There is support for using Aspire to work with mock services during development, so you can connect to dummy mail servers and the like, swapping for live services in production.
Aspire Community Toolkit integrations began life as new custom integrations, which you can use to create your own links to internal services or to external APIs to use in your Aspire applications. For now, most integrations are written using .NET, adding custom resources and configurations.
At the heart of an integration is the Aspire.Hosting package reference. This is used to link the methods and resources in a class library to your Aspire integration.
Adding custom integrations
Building a new hosting integration starts with a project that’s designed to test that integration, which will initially be a basic AppHost that we’ll use to connect to the integration and display it in the Aspire dashboard. If you run the test project, you’ll see basic diagnostics and a blank dashboard.
Next, we need to create another project to host our new resources. This time we’re creating a .NET class library, adding the Aspire.Hosting package to this project. While it’s still blank, it can now be added as a reference to the test project. First make sure that the class library is treated as a non-service project by editing its project reference file. This will stop the project failing to run.
We’re now ready to start writing the code to implement the resource we’re building an integration for. Resources are added to the Aspire.Hosting.ApplicationModel namespace, with endpoint references and any necessary connection strings. This is where Aspire code will integrate with your new resource, providing a platform-agnostic link between application and service.
Your project now needs an extension method to handle configuration, using Aspire’s builder method to download and launch the container that hosts the service you’re adding. If you’re targeting a static endpoint, say a SAP application or similar with a REST API, you can simply define the endpoint used, either HTTP or a custom value.
With this in place, your new integration is ready for use, and you can write code that launches it and works with its endpoints. In production, of course, you’ll need to ensure that your endpoints are secure and that messages sent to and from them are sanitized. That also means making sure your application deployment runs on private networks and isn’t exposed to the wider internet, so be sure to consider how your provider configures its networking.
You can simplify things by ensuring that your integration publishes a manifest that contains details like host names and ports. Once you have a working integration, you’re able to package it as a NuGet package for sharing with colleagues or the wider internet.
A community to build the future
Moving from a .NET-only Aspire to one that supports the tools and platforms you want to use makes a lot of sense for Microsoft. Cloud-native, distributed applications are hard to build and run, so anything that simplifies both development and operations should make a lot of developers’ lives easier. By adopting a code-based approach to application architecture and services, Aspire embodies key devops principles and bakes them into the software development process.
For now, there will still be dependencies on .NET in Aspire, even though you can build integrations for any language or platform—or any application endpoint, for that matter. There are some complexities associated with building integrations, but we can expect the process to become a lot simpler as more developers adopt the platform and as they start to share their own integrations with the community. This is perhaps key to this change of direction in Aspire. If it is to be successful as a polyglot application development tool, it needs to have buy-in, not only from its existing core developers, but from experts in all the languages and services it wants to consume so that we are able to build the best possible code.
Building a bigger community of engaged contributors is key to Aspire’s future. Emphasizing features like the Aspire Community Toolkit as a way for integrations to graduate from being experiments to being part of the platform will be essential to any success.
Developers don’t care about Kubernetes clusters 6 Nov 2025, 1:00 am
If you look at the Cloud Native Computing Foundation landscape, it might seem that cloud developers are a lucky bunch. There seems to be an existing tool for literally every part of the software development life cycle. This means developers can focus on what they want (i.e., creating features) while everything else (e.g., continuous integration and deployment) is already in place. Right?
Not so fast. The CNCF landscape tells only part of the story. If you look at the cloud tools available, you might think that everything is covered and we actually have more tools than needed.
The problem, however, is that the cloud ecosystem right now has the wrong focus. Most of the available tools are aimed at administrators and operators instead of feature developers. This creates a paradox: the more tools your organization adopts, the less happy your developers are. Can we avoid this?
Looking beyond the clusters
It was only natural that the first cloud tools would be about creating infrastructure. After all, you need a place to run your application, in order to offer value to your end users. The clear winner in the cloud ecosystem is Kubernetes, and many tools revolve around it. Most of these tools only deal with the cluster itself. You can find great tools today that
- Create Kubernetes clusters
- Monitor Kubernetes clusters
- Debug Kubernetes clusters
- Network and secure Kubernetes clusters
- Auto-scale and cost-optimize the cluster according to load
This is a great starting point, but it doesn’t actually help developers in any way. Developers only care about shipping features. Kubernetes is a technical detail for them, as virtual machines were before Kubernetes.
The problem is that almost all the available tools focus on individual clusters. If your organization is using any kind of Kubernetes dashboard, I would bet that on the left sidebar there is a nice big button called “clusters” that shows a list of all available Kubernetes installations.
But here is the hard truth. Developers don’t care about Kubernetes clusters. They care about environments—more specifically the classic trilogy of QA, staging, and production. That’s it.
Maybe in your organization Staging is a single cluster. Maybe Staging is two clusters. Maybe Staging is a namespace inside another bigger cluster. Maybe Staging is even a virtual cluster. It doesn’t really matter for developers. All they want to see is an easy way to deploy their features from one environment to the next.
If you want to make life easy for developers, then offer them what they actually need.
- A list of predefined environments with a logical progression structure
- A way to “deploy” their application to those environments without any deep Kubernetes knowledge
- An easy way to create temporary preview environments for testing a new feature in isolation
- A powerful tool to debug deployments when things go wrong.
In this manner, developers will be able to focus on what actually matters to them. If you force developers to learn Helm, Kustomize, or how Kubernetes manifests work, you are wasting their time. If every time a deployment fails, your answer is “just use kubectl to debug the cluster,” then you are doing it wrong.
Promotions are more critical than deployments
So, let’s say you followed my advice and offered your developers a nice dashboard that presents environments instead of individual clusters. Is that enough?
It turns out that you must also offer a way to “deploy” to those environments. But here is the critical point. Make sure that your fancy dashboard understands the difference between a deployment and a promotion.
A deployment to an environment should be a straightforward process. A developer must be able to choose
- A version of their application (the latest one or a previous one)
- An environment with appropriate access
- A way to make sure that the deployment has finished successfully.
Sounds simple, right? It is simple, but this process is only useful for the first environment where code needs to be verified. For the rest of the environments in the chain, developers want to promote their application.
Unlike a deployment, a promotion is a bit more complex. It means that a developer wants to take what is already available in the previous environment (e.g., QA) and move that very same package to the next environment (e.g., Staging).
The catch is that, across all the environments of an organization, there is a constant tension over how “similar” two environments should be. In the most classic example, your Staging environment should be as close as possible to Production in order to make sure that you test your application in similar conditions.
On the other hand, it should be obvious that Staging should not be able to access your production database or your production queues. You should have separate infrastructure for handling Staging data.
This means that by definition some configuration settings (e.g., database credentials) are different between Production and Staging. So when a developer wants to “promote” an application, what they really want to do is
- Take the parts of the application that actually need to move from one environment to another
- Ignore the configuration settings that stay constant between environments.
This is a very important distinction, and the majority of cloud tools do not understand it. Many production incidents start because a configuration setting differed between production and staging (or whichever environment came before production), or because the developer deployed the wrong version to production, bypassing the previous environments.
Coming back to your developer dashboard: if you offer your developers just a drop-down list of all possible versions of an application, allowing them to choose what to deploy, you are doing it wrong. What developers really want is to promote whatever is active and verified in the previous environment.
At least for production this should be enforced at all times. Production deployments are the last step in a chain where a software version is gradually moved from one environment to another.
Behind the scenes, your fancy dashboard should also understand what configuration needs to be promoted and what configuration stays the same for each environment. In the case of Kubernetes, for example, the number of replicas for each environment is probably static. But your application’s configmaps should move from one environment to another when a promotion happens.
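To make the distinction concrete, here is a small, purely hypothetical Python sketch of promotion logic that moves the release artifact between environments while leaving environment-specific settings untouched; it is not drawn from any particular tool.

```python
# Hypothetical sketch: promoting an application between environments.
# Only the release artifact (the image tag) travels; environment-specific
# configuration such as credentials and replica counts stays where it is.
ENVIRONMENTS = {
    "qa":      {"image": "shop:1.4.2", "db_url": "qa-db.internal",   "replicas": 1},
    "staging": {"image": "shop:1.4.1", "db_url": "stg-db.internal",  "replicas": 2},
    "prod":    {"image": "shop:1.4.0", "db_url": "prod-db.internal", "replicas": 6},
}

PROMOTABLE_KEYS = {"image"}  # the parts that actually move during a promotion

def promote(source: str, target: str) -> None:
    """Copy only the promotable parts of `source` into `target`."""
    for key in PROMOTABLE_KEYS:
        ENVIRONMENTS[target][key] = ENVIRONMENTS[source][key]

promote("qa", "staging")
assert ENVIRONMENTS["staging"]["image"] == "shop:1.4.2"        # artifact moved
assert ENVIRONMENTS["staging"]["db_url"] == "stg-db.internal"  # config stayed
```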
Deployment pipelines no longer work in the cloud era
We have covered environments and promotions, so it is time to talk about how exactly a deployment takes place. The traditional way of deploying an application is via pipelines. Most continuous integration software has a way of creating pipelines as a series of steps (or scripts) that execute one after the other.
The typical pipeline consists of:
- Checking out the source code of the application
- Compiling and building the code
- Running unit and integration tests
- Scanning the code for vulnerabilities
- Packaging the code in its final deliverable.
Before the cloud, it was common to have another step in the pipeline that took the binary artifact and deployed it to a machine (via FTP, rsync, SSH, etc.). The problem with this approach is that the pipeline only knows what is happening while the pipeline is running. Once the pipeline has finished, it no longer has visibility into what is happening in the cluster.
This creates a very unfortunate situation for developers, with the following pattern:
- A developer is ready to perform a deployment
- They start the respective pipeline in the continuous integration dashboard
- The pipeline runs successfully and deploys the application to the cluster
- The pipeline ends with a “green” status
- Five minutes later the application faces an issue (e.g., slow requests, missing database, evicted pods)
- The developer still sees the pipeline as “green” and has no way of understanding what went wrong.
It is at this point that developers are forced to look at complex metrics or other external systems in order to understand what went wrong. But developers shouldn’t have to look in multiple places to understand if their deployment is OK or not.
Your deployment system should also monitor applications, even after the initial deployment has finished. This is an absolute requirement for cloud environments where resources come and go—especially in the case of Kubernetes clusters, where autoscaling is in constant effect.
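As one illustration of what that continued monitoring might look like, the sketch below uses the official Kubernetes Python client to keep checking a deployment after the pipeline has gone green; the deployment name, namespace, and polling interval are placeholders.

```python
# Sketch: keep watching a deployment after the CI pipeline has reported success.
# Uses the official `kubernetes` Python client; the name, namespace, and polling
# interval are placeholders for your own workload.
import time
from kubernetes import client, config

config.load_kube_config()  # or config.load_incluster_config() inside a cluster
apps = client.AppsV1Api()

def deployment_healthy(name: str, namespace: str) -> bool:
    status = apps.read_namespaced_deployment_status(name, namespace).status
    desired = status.replicas or 0
    ready = status.ready_replicas or 0
    return desired > 0 and ready == desired

# Keep polling for a while *after* the deployment step went "green".
for _ in range(30):
    if not deployment_healthy("my-app", "staging"):
        print("Deployment degraded after the pipeline finished")
        break
    time.sleep(10)
```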
Catching up with cloud deployments
Cloud computing comes with its own challenges. Most existing tools were created before the cloud revolution and were never designed for the dynamic nature of cloud deployments. In this new era, developers are left behind because nobody really understands what they need.
In the case of Kubernetes, existing tools tend to be oriented towards operators and administrators:
- They show too much low-level information that is not useful for developers
- They don’t understand how environments are different and how to promote applications
- They still have the old mindset of continuous integration pipelines.
We need to rethink how cloud computing affects developers. With the recent surge in generative AI and LLM tools, deploying applications will quickly become the bottleneck. Developers will be able to quickly create features with their smart IDEs or AI agents, but they will never understand how to promote applications or how to quickly pinpoint the cause of a failed deployment.
—
New Tech Forum provides a venue for technology leaders—including vendors and other outside contributors—to explore and discuss emerging enterprise technology in unprecedented depth and breadth. The selection is subjective, based on our pick of the technologies we believe to be important and of greatest interest to InfoWorld readers. InfoWorld does not accept marketing collateral for publication and reserves the right to edit all contributed content. Send all inquiries to doug_dineley@foundryco.com.
Mozilla.ai releases universal interface to LLMs 5 Nov 2025, 1:46 pm
Mozilla.ai, a company backed by the Mozilla Foundation, has released any-llm v1.0, an open-source Python library that provides a single interface to communicate with different large language model (LLM) providers.
any-llm 1.0 was released November 4 and is available on GitHub. With any-llm, developers can use any model, cloud or local, without rewriting a stack every time. This means less boilerplate code, fewer integration headaches, and more flexibility to pick what works best for the developer, Nathan Brake, machine learning engineer at Mozilla.ai, wrote in a blog post. “We wanted to make it easy for developers to use any large language model without being locked into a single provider,” Brake wrote.
Mozilla.ai initially introduced any-llm on July 24. The 1.0 release has a stable, consistent API surface, async-first APIs, and re-usable client connections for high-throughput and streaming use cases, Brake wrote. Clear deprecation and experimental notices are provided to avoid surprises when API changes may occur.
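A minimal usage sketch follows, assuming the provider-prefixed model naming and OpenAI-style response objects described in the project’s documentation; parameter names and the model-string format may differ between releases, so check the any-llm 1.0 reference.

```python
# Minimal sketch of calling a model through any-llm's single interface.
# Assumes a "provider/model" identifier and an OpenAI-style response object, as
# described in the project docs; verify the exact format for the 1.0 release.
# The chosen provider's API key must be available in the environment.
from any_llm import completion

response = completion(
    model="openai/gpt-4o-mini",  # swap the provider prefix to switch backends
    messages=[{"role": "user", "content": "Summarize what any-llm does."}],
)
print(response.choices[0].message.content)
```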
The any-llm v1.0 release adds the following capabilities:
- Improved test coverage for stability and reliability
- Responses API support
- A List Models API to programmatically query supported models per provider
- Re-usable client connections for better performance
- Standardized reasoning output across all models, allowing users to access LLM reasoning results regardless of the provider chosen
- Auto-updating of the provider compatibility matrix, which shows which features are supported by which providers
Future plans call for support for native batch completions, support for new providers, and deeper integrations inside of the company’s other “any-suite” libraries including any-guardrail, any-agent, and mcpd.
How multi-agent collaboration is redefining real-world problem solving 5 Nov 2025, 1:09 am
When I first started working with multi-agent collaboration (MAC) systems, they felt like something out of science fiction: a group of autonomous digital entities that negotiate, share context, and solve problems together. Over the past year, MAC has begun to take practical shape, with applications to real-world problems including climate-adaptive agriculture, supply chain management, and disaster management. It is slowly emerging as one of the most promising architectural patterns for addressing complex, distributed challenges in the real world.
In simple terms, MAC systems consist of multiple intelligent agents, each designed to perform specific tasks, that coordinate through shared protocols or goals. Instead of one large model trying to understand and solve everything, MAC systems decompose work into specialized parts, with agents communicating and adapting dynamically.
Traditional AI architectures often operate in isolation, relying on predefined models. While powerful, they tend to break down when confronted with unpredictable or multi-domain complexity. For example, a single model trained to forecast supply chain delays might perform well under stable conditions, but it often falters when faced with situations like simultaneous shocks, logistics breakdowns or policy changes. In contrast, multi-agent collaboration distributes intelligence. Agents are specialized units on the ground responsible for analysis or action, while a “supervisor” or “orchestrator” coordinates their output. In enterprise terms, these are autonomous components collaborating through defined interfaces.
The Amazon Bedrock platform is one of the few early commercial examples that provide multi-agent collaboration capability. It uses a supervisor agent that breaks down a complex user request, say “optimizing a retail forecast,” into sub-tasks for domain-specific agents to carry out, such as data retrieval, model selection and synthesis.
This decomposition helps improve decision-making accuracy and, at the same time, provides more transparency and control. At the protocol layer, standards like Google’s Agent-to-Agent (A2A) and Anthropic’s Model Context Protocol (MCP) define how agents discover and communicate across environments. Think of them as the TCP/IP of collaborative AI, enabling agents built by different organizations or using different models to work together safely and efficiently.
The architecture of multi-agent collaboration
Solving global, real-world problems requires architectures that can maintain a balance between autonomy, communication and oversight. In my experience, designing such a system at a high level means working across four interoperable layers:
1. Agent layer: Specialization
This layer contains individual agents, each having a dedicated role such as prediction, allocation, logistics or regulation. Agents can be fine-tuned LLMs, symbolic planners or hybrid models wrapped in domain-specific APIs. This modularity mirrors microservice design: loosely coupled, highly cohesive.
2. Coordination layer: Orchestration
This layer acts as the nervous system, keeping agents connected to one another. Agents exchange intents instead of raw data using A2A, MCP or custom message brokers (e.g., Kafka, Pulsar). The orchestration layer routes these intents between agents, resolves conflicts and aligns timing. It can support different topologies, including centralized, peer-to-peer or hierarchical, depending on latency and trust requirements.
3. Knowledge layer: Shared context
This layer provides memory for the agents, a shared context store, typically a vector database (e.g., Weaviate, Pinecone) combined with a graph database (e.g., Neo4j), that maintains world state: facts, commitments, dependencies and outcomes. This persistent memory ensures continuity across events and agents.
4. Governance layer: Oversight and trust
This layer provides governance through policy enforcement, decision audits and human involvement via ad hoc inspections and checkpoints. It also manages authentication and explainability and ensures decisions remain within legal and ethical bounds.
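A deliberately simplified Python sketch of how these four layers can fit together is shown below; every class, intent name, and policy is hypothetical and intended only to show the separation of concerns, not any particular framework.

```python
# Hypothetical skeleton of the four MAC layers; all names are illustrative.
from dataclasses import dataclass

@dataclass
class Intent:                      # what agents exchange instead of raw data
    sender: str
    goal: str
    payload: dict

class ForecastAgent:               # 1. Agent layer: one specialized role
    name = "forecast"
    def handle(self, intent: Intent, context: dict) -> Intent:
        context["forecast"] = {"demand": 120}   # 3. Knowledge layer: shared context
        return Intent(self.name, "inventory.reorder", context["forecast"])

class Orchestrator:                # 2. Coordination layer: routes intents
    def __init__(self, agents: dict, policy):
        self.agents, self.policy, self.context = agents, policy, {}
    def dispatch(self, intent: Intent) -> Intent:
        if not self.policy(intent):             # 4. Governance layer: policy check
            raise PermissionError(f"intent {intent.goal} rejected")
        agent = self.agents[intent.goal.split(".")[0]]
        return agent.handle(intent, self.context)

allow_all = lambda intent: True
orchestrator = Orchestrator({"forecast": ForecastAgent()}, policy=allow_all)
print(orchestrator.dispatch(Intent("sensor", "forecast.update", {"region": "emea"})))
```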
Multi-agent collaboration in action
The real excitement around multi-agent collaboration isn’t confined to cloud platforms or developer sandboxes. It’s happening in the physical and environmental systems that sustain our world.
Climate-adaptive agriculture: Agents for a living planet
Nowhere have I found this shift more urgent or inspiring than in climate-adaptive agriculture. Today, farmers are confronting growing uncertainty in rainfall, soil health and temperature variability. Centralized AI models can provide useful insights, but they rarely adapt quickly to localized changes.
In contrast, a multi-agent ecosystem can coordinate real-time sensing, forecasting and action across distributed farms:
- Sensor agents monitor soil moisture and nutrient data.
- Weather agents pull localized forecasts and detect anomalies.
- Irrigation agents decide watering schedules, negotiating water allocation with regional policy agents.
- Market agents adjust planting and distribution strategies based on demand and logistics.
In precision agriculture projects, I’ve researched how farmers using multi-agent systems that integrate aerial drones with ground robots have reported crop yield increases of up to 10%, while simultaneously reducing input costs. That’s not a theoretical projection — it’s happening on working farms right now.
Here’s how it works in practice: UAVs (drones) survey fields from above, identifying problem areas and monitoring crop health across hundreds of acres. Meanwhile, ground-based robots handle targeted interventions like precise irrigation, fertilizer application or pest management. The key is that these agents communicate and coordinate. When a sensor detects a sudden increase in soil moisture in one area, the irrigation system automatically adjusts to prevent overwatering. No human intervention or central command center is required for making all the decisions.
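A toy sketch of that kind of sensor-to-irrigation coordination is below; the moisture threshold, field ID, and watering rule are invented purely to show how one agent’s reading can drive another agent’s decision without a central controller.

```python
# Toy sketch: a soil-moisture reading triggers an irrigation adjustment without
# a central command center. The threshold, IDs, and watering rule are invented.
MOISTURE_TARGET = 0.35   # fraction of saturation, illustrative only

class IrrigationAgent:
    def __init__(self):
        self.schedule = {}                     # field_id -> minutes of watering
    def on_moisture_reading(self, field_id: str, moisture: float) -> None:
        if moisture >= MOISTURE_TARGET:
            self.schedule[field_id] = 0        # wet enough: skip watering
        else:
            deficit = MOISTURE_TARGET - moisture
            self.schedule[field_id] = round(deficit * 100)  # crude proportional rule

class SensorAgent:
    def __init__(self, subscribers):
        self.subscribers = subscribers
    def publish(self, field_id: str, moisture: float) -> None:
        for agent in self.subscribers:         # peer-to-peer notification
            agent.on_moisture_reading(field_id, moisture)

irrigation = IrrigationAgent()
SensorAgent([irrigation]).publish("field-A3", moisture=0.52)
print(irrigation.schedule)   # {'field-A3': 0} -> overwatering avoided
```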
Supply chain collaboration under pressure
The global supply chain is another proving ground for MAC. A single bottleneck, whether caused by weather, labor strikes or geopolitical tension, can ripple across continents. Multi-agent systems provide a way to detect, simulate and respond to those disruptions faster than traditional analytics pipelines.
Multi-agent systems in supply chains involve networks of AI-powered agents that work together autonomously, making the supply chain smarter, faster and more resilient. The beauty of these systems lies in their autonomy and flexibility, where each agent can make decisions within its realm while communicating and collaborating to achieve overarching goals.
Here’s how I’ve found collaboration plays out in practice:
- In demand forecasting, one agent might analyze social media trends while another examines economic indicators. Working together, they create a more accurate forecast.
- For inventory management, an agent monitoring sales trends can instantly communicate with another controlling reordering to ensure optimal stock levels.
- In logistics optimization, one agent plans the best truck routes while another monitors traffic conditions; if a road closure occurs, the agents can quickly recalculate and reroute in real time.
The integration creates a digital nervous system for supply chains, enabling unprecedented levels of coordination and efficiency, with companies reporting an average 15% reduction in overall supply chain costs. The systems provide enhanced end-to-end visibility, improved demand forecasting accuracy, reduced planning costs by over 25%, increased agility in responding to market fluctuations and optimized inventory management.
Multi-agent disaster management systems
The same principles of distributed intelligence are also redefining disaster management. In these high-stakes environments, I’ve found that coordination and adaptability can mean the difference between life and death.
When I first began exploring multi-agent disaster response systems, I was struck by how they function like a digital ecosystem of autonomous specialists. Each agent, representing rescue workers, evacuees or information hubs, acts independently but coordinates through shared situational awareness. By processing data and executing localized decisions in parallel, multi-agent systems dramatically reduce response latency and improve resilience in uncertain environments.
In simulated evacuations, for instance, each virtual evacuee is modeled as an agent with unique physical and psychological attributes such as age, health and stress level that evolve in real time. The emergent behavior that arises from thousands of these agents interacting offers critical insights into crowd dynamics and evacuation strategies that static models could never capture.
Lessons for system architects
Architecting multi-agent ecosystems demands new design heuristics:
- Design for negotiation, not command. Replace schedulers with protocols where agents bargain over shared goals.
- Treat memory as infrastructure. Context persistence is as critical as compute.
- Embed governance early. Auditing and policy hooks must be first-class citizens.
- Prioritize modular onboarding. Use schemas and APIs that allow new agents to join with minimal friction.
In this paradigm, coordination becomes a first-order system capability. Future cloud platforms will likely evolve to provide “cooperation primitives”: built-in support for intent passing, conflict arbitration and collective state management.
The road ahead: Standards, security and trust
Like any emerging paradigm, MAC comes with its share of unanswered questions. How do we keep agents aligned when they act semi-autonomously? Who defines their access rights and goals? And what happens when two agents disagree?
Early standards such as the Model Context Protocol (MCP) and Agent-to-Agent (A2A) are beginning to shape the answers. They make it possible for agents to communicate securely, share context and discover one another in permissioned ways. But technology alone won’t solve the deeper challenges. Organizations will also need governance frameworks, with clear rules for delegation, auditing and alignment, to prevent “agent sprawl” as systems scale.
In practice, the most successful MAC pilots typically start small, with a few agents automating tasks such as data triage or workflow handoffs. Over time, they evolve into full-fledged ecosystems where collaboration between agents feels as natural as calling an API.
That evolution, however, comes with new responsibilities:
- Balancing goals: When agents have conflicting goals, for example, one trying to maximize yield while another aims to minimize emissions, they need a way to resolve those differences through arbitration models that balance fairness with efficiency.
- Securing the network: A single malicious or compromised agent could distort results or spread misinformation. Robust identity and trust management are non-negotiable.
- Building transparency: For high-impact systems, humans must be able to trace why an agent made a decision. Clear logs and language-level reasoning trails make that possible.
- Testing at scale: Before deployment, thousands of agents need to be stress-tested in realistic environments. Tools like MechAgents and SIMA are paving the way here.
Ultimately, the future of multi-agent collaboration will depend not just on smarter technology but on how well we design for trust, transparency and responsible governance. The organizations that get this balance right will be the ones that turn MAC from a promising experiment into a lasting advantage.
A change in how we think about intelligence itself
Multi-agent collaboration represents a transformational shift from building smarter models to building smarter networks. It’s a change in how we think about intelligence itself: not as a single entity, but as a collection of cooperating minds, each contributing a piece of situational understanding.
As someone who has spent years in enterprise systems, I find that deeply human. We thrive not as isolated experts but as collaborators, each with a unique role and perspective. The same principle is now shaping the next generation of AI. Whether we’re managing crops, supply chains or disasters, the path forward looks less like command-and-control and more like conversation.
This article is published as part of the Foundry Expert Contributor Network.
Some thoughts on AI and coding 5 Nov 2025, 1:00 am
Holy cow, things are moving fast in the software business—ideas are coming like an oil gusher, and keeping up is a challenge. Here are a few observations, amazements, puzzlements, and prognostications regarding AI and software development that have occurred to me recently.
Vibe coding for the win
Vibe coding has come a long way in the six months since I first tried it. I recently picked up the same project that I (or rather Claude Code) had built, and I was shocked at how much better the agent was. In the first go round, I had to keep a close eye on things to make sure that the agent didn’t go off the rails. But this time? It pretty much did everything right the first time. I’m still stunned. I suspect it’s going to be a while before the amazement wears off.
One thing that is absolutely fantastic about vibe coding is debugging. Those cryptic error messages that take human programmers minutes or sometimes hours to run down can be deciphered and debugged by AI in seconds. I’m now at the point where I don’t even ask the agent about the error message; I just enter it, and it automatically identifies the problem. A great example: package dependency hell. No human can decipher the deep dependency chains created by our applications today, but AI can untangle—and fix!—these issues without missing a beat.
I expect we will see an explosion of what might be called “boutique software” as a result of vibe coding. There are endless ideas for websites and mobile apps that never got written or created because the cost to produce them outweighed the benefits they promised. But if the cost of producing them is drastically reduced, then that cost/benefit ratio becomes viable, and those small but great ideas will come to fruition. Prepare for “short form software,” similar to what TikTok did for content producers.
Software development is uniquely positioned to take advantage of AI agents. Large language models (LLMs) are—no surprise—based on text. They take text as input and produce text as output. Given that code is all text, LLMs are particularly good at producing code. And because computer code isn’t particularly nuanced compared to spoken language, AI easily learns from existing code and thus excels at producing code. It’s a virtuous cycle.
Software development futures
The previous point creates a dilemma of sorts. Up until now, humans have written all the code that LLMs train on. As humans write less and less code, what will the LLMs be trained on? Will they learn from their own code? I’m guessing what will happen is that humans will continue to design the building blocks—components, libraries, and frameworks—and LLMs will “riff” off of the scaffolding that humans create. Of course, it may be that at some point AI will be able to learn from itself, and we humans will merely describe what we want and get it without worrying about the code at all.
It seems kind of nuts to put limits on what AI can do in coding and software engineering. “We’ll always need software developers” is easy to say, but frankly, I’m not so sure it is true. I suppose it was easy to say “We’ll always need farmers” or “We’ll always need autoworkers.” Although both of those statements are still true, there are a lot fewer farmers and autoworkers today than there were decades ago. I suppose there will always be a need for software developers—the question is how many.
Hidden Figures is a beautiful movie about a group of Black women who were instrumental in getting the early US space program off the ground. They were called “computers” because they literally computed trajectories, landing coordinates, and all the precise calculations needed to safely conduct space flight. They did heroic and admirable work. But today, all of those calculations can be done with a Google spreadsheet. I think that AI is going to do to software developers what HP calculators did to human computers.
At this point, the only thing I can predict is that no one has a clue where software development is headed. AI is such a strong catalytic force that no one knows what will happen next week, much less next month or next year. Whatever does happen, it is going to be amazing.
AI and machine learning outside of Python 5 Nov 2025, 1:00 am
Name a language used for machine learning and artificial intelligence. The first one that comes to mind is probably Python, and you wouldn’t be wrong for thinking that. But what about the other big-league programming languages?
C++ is used to create many of the libraries Python draws on, so its presence in AI/ML is established. But what about Java, Rust, Go, and C#/.NET? All have a major presence in the enterprise programming world; shouldn’t they also have a role in AI and machine learning?
Java
In some ways, Java was the key language for machine learning and AI before Python stole its crown. Important pieces of the data science ecosystem, like Apache Spark, started out in the Java universe. Spark pushed the limits of what Java could do, and newer projects continue to expand on that. One example is the Apache Flink stream-processing system, which includes AI model management features.
The Java universe—meaning the language, the JVM, and its ecosystem (including other JVM languages like Kotlin)—provides a solid foundation for writing machine learning and AI libraries. Java’s strong typing and the speed of the JVM mean native Java applications don’t need to call out to libraries in other languages to achieve good performance.
Java-native machine learning and AI libraries exist, and they’re used at every level of the AI/ML stack. Those familiar with the Spring ecosystem, for instance, can use Spring AI to write apps that use AI models. Apache Spark users can plug into Spark’s MLlib layer to do machine learning at scale. And libraries like GPULlama3 support GPU-accelerated computation, a key component of machine learning, in Java.
The one major drawback to using Java for machine learning—shared with most other languages profiled here—is its relatively slow edit-compile-run cycle. That limitation makes Java a poor choice for running experiments, but it’s a prime choice for building libraries and inference infrastructure.
Rust
Despite its relative youth (Rust is just 13 years old to Java’s 30), Rust has made huge inroads across the development world. Rust’s most touted features—machine-native speed, memory safety, and its strong type system—provide a solid foundation for writing robust data science tools.
Odds are any work you’ve done in the data science field by now has used at least one Rust-powered tool. An example is the Polars library, a dataframe system with wrappers for multiple languages. A culture of Rust-native machine learning and data science tools (tools meant to be used in the Rust ecosystem and not just exported elsewhere) has also started to take shape over the last few years.
Some of the projects in that field echo popular tools in other languages, such as ndarray, a NumPy-like array processing library. Others, like tract, are for performing inference on ONNX or NNEF models. And others are meant to be first-class building blocks for doing machine learning on Rust. For instance, burn is a deep learning framework that leverages Rust’s performance, safety, and compile-time optimizations to generate models that are optimized for any back end.
Rust’s biggest drawback when used for machine learning or AI is the same as Java’s: Compile times for Rust aren’t trivial, and large projects can take a while to build. In Rust, that issue is further exacerbated by the large dependency chains that can accumulate in its projects. That all makes doing casual AI/ML experiments in Rust difficult. Like Java, Rust is probably best used for building the libraries and back ends (i.e., infrastructure and services) rather than for running AI/ML experiments themselves.
Go
At a glance, the Go language has a major advantage over Rust and Java when it comes to machine learning and AI: Go compiles and runs with the speed and smoothness you expect from an interpreted language, making it far better suited as a playground for running experiments.
Where Go falls short is in the general state of its libraries and culture for such tasks. Back in 2023, data scientist Sooter Saalu offered a rundown on Go for machine learning. As he noted, Go had some native machine-learning resources, but lacked robust support for CUDA bindings and had poor math and stats libraries compared to Python or R.
As of 2025, the picture isn’t much improved, with most of the high-level libraries for AI/ML in Go currently languishing. GoLearn, one of the more widely used machine learning libraries for Go, has not been updated in three years. Likewise, Gorgonia, which aims for the same space as Theano and TensorFlow, hasn’t been updated in about the same time frame. SpaGO, an NLP library, was deprecated by its author in favor of Rust’s Candle project.
This state of affairs reflects Go’s overall consolidation around network services, infrastructure, and command-line utilities, rather than tasks like machine learning. Currently, Go appears to be most useful for tasks like serving predictions on existing models, or working with third-party AI APIs, rather than building AI/ML solutions as such.
C# and .NET
Over the years, Microsoft’s C# language and its underlying .NET runtime have been consistently updated to reflect the changing needs of its enterprise audience. Machine learning and generative AI are among the latest use cases to join that list. Released in 2024, .NET 9 promised expanded .NET libraries and tooling for AI/ML. A key feature there, Microsoft’s Semantic Kernel SDK, is a C# tool for working with Microsoft’s Azure OpenAI services, using natural language inputs and outputs.
Other implementations of the Semantic Kernel exist, including a Python edition, but the .NET incarnation plays nice (and natively) with other .NET 9 AI/ML additions—such as C# abstractions and new primitive types for working with or building large language models. One example, the VectorData abstraction, is for working with data types commonly used to build or serve AI/ML models. The idea here is to have types in C# itself that closely match the kind of work done for those jobs, rather than third-party additions or higher-level abstractions. Other Microsoft-sourced .NET libraries aid with related functions, like evaluating the outputs of LLMs.
The major issue with using C# and .NET for AI/ML development is the overall lack of adoption by developers who aren’t already invested in the C#/.NET ecosystem. Few, if any, developer surveys list C# or other .NET languages as having significant uptake for AI/ML. In other words, C#/.NET’s AI/ML support seems chiefly consumed by existing .NET applications and services, rather than as part of any broader use case.
Conclusion
It’s hard to dislodge Python’s dominance in the AI/ML space, and not just because of its incumbency. Python’s convenience, along with its richness of utility and broad culture of software, all add up.
Other languages can still be key players in the machine learning and AI space; in fact, they already are. Spark and similar Java-based technologies empower a range of AI/ML tools that rely on the JVM ecosystem. Likewise, C# and the .NET runtime remain enterprise stalwarts, with their own expanding subset of AI/ML-themed native libraries and capabilities. Rust’s correctness and speed make it well-suited to writing libraries used throughout both its own ecosystem and others. And Go’s popularity for networking and services applications makes it well-suited for providing connectivity and serving model predictions, even if it isn’t ideal for writing AI/ML apps.
While none of these languages is currently used for the bulk of day-to-day experimental coding, where Python is the most common choice, each still has a role to play in the evolution of AI and machine learning.
