Meta shows structured prompts can make LLMs more reliable for code review 1 Apr 2026, 3:22 am

Meta researchers have developed a structured prompting technique that enables LLMs to verify code patches without executing them, achieving up to 93% accuracy in tests.

The method, dubbed semi-formal reasoning, could help reduce reliance on the resource-heavy sandbox environments currently required for automated code validation.

The development comes as organizations look to deploy agentic AI for repository-scale tasks like bug detection and patch validation. Traditional execution-based approaches often struggle to scale across large, heterogeneous codebases.

Instead of using free-form reasoning that can lead to hallucinations, the technique introduces structured logical certificates. These require models to explicitly state assumptions and trace execution paths before deriving a conclusion.

The researchers evaluated the approach across three key tasks: patch equivalence verification, fault localization, and code question answering. Semi-formal reasoning improved accuracy on all three.

“For patch equivalence, accuracy improves from 78% to 88% on curated examples and reaches 93% on real-world agent-generated patches, approaching the reliability needed for execution-free RL reward signals,” the researchers said in the paper.

For code question answering, semi-formal reasoning reaches 87% accuracy, a nine-percentage-point improvement over standard agentic reasoning. In fault localization, it boosts Top-5 accuracy by five percentage points compared to standard approaches.

How it works

Semi-formal reasoning occupies a middle ground between unstructured chat and rigid formal verification. While standard reasoning allows models to make claims without justification, this approach uses a predefined template that mandates a step-by-step process.

“Rather than training specialized models or formalizing semantics, we prompt agents with structured reasoning templates that require explicit evidence for each claim,” the researchers said.

They added that the “templates act as certificates: the agent must state premises, trace relevant code paths, and provide formal conclusions. The structured format naturally encourages interprocedural reasoning, as tracing program paths requires the agent to follow function calls rather than guess their behavior.”

In practice, this forces the model to behave like a developer stepping through code line by line.
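As a rough illustration, a certificate-style template of this kind might be expressed as a prompt scaffold like the one below. The section names (PREMISES, TRACE, and so on) are invented for illustration; the paper's actual templates are not reproduced here.

```python
# Hypothetical sketch of a "semi-formal reasoning" prompt scaffold.
# Section names are illustrative, not taken from Meta's paper.
CERTIFICATE_TEMPLATE = """\
Task: Decide whether PATCH_A and PATCH_B are semantically equivalent.

Produce a certificate with the following sections, in order:

1. PREMISES: List every assumption you rely on (input types, invariants,
   environment). Each premise must cite a line of code or a docstring.
2. TRACE: Step through each affected execution path. When a path calls
   another function, follow the call; do not guess its behavior.
3. DIVERGENCE CHECK: For each traced path, state whether the two patches
   can produce different observable results, and under which premise.
4. CONCLUSION: EQUIVALENT or NOT_EQUIVALENT, derived only from the
   sections above. No claim may appear here that was not traced in 2-3.
"""

def build_prompt(patch_a: str, patch_b: str) -> str:
    """Attach the two patches under comparison to the certificate template."""
    return f"{CERTIFICATE_TEMPLATE}\nPATCH_A:\n{patch_a}\nPATCH_B:\n{patch_b}\n"
```

The key property is that the conclusion section is constrained to draw only on evidence produced in the earlier sections, which is what distinguishes a certificate from free-form chain-of-thought.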

Researchers said that in one case involving the Django framework, the structured approach revealed that a module-level function shadowed Python’s built-in format() function. While standard reasoning missed this nuance, the semi-formal analysis correctly identified that the code would fail.

Implications for enterprises

Analysts said semi-formal reasoning signals a shift from assistive AI to more accountable AI in software engineering, a distinction that could reshape how enterprises approach code review.

“Tools like GitHub Copilot have conditioned developers to interact with AI as a fast, fluent suggestion engine,” said Sanchit Vir Gogia, chief analyst at Greyhound Research. “You ask, it generates, you accept or tweak. The system optimizes for speed and plausibility. What it does not optimize for is proof.”

Semi-formal reasoning changes that dynamic. Instead of rewarding models for sounding correct, it requires them to demonstrate correctness by tracing logic and grounding conclusions. For developers, this shifts the focus from reviewing outputs to evaluating the reasoning behind them.

“The deeper implication is that code review itself starts to evolve,” Gogia said. “Historically, code review has been a human bottleneck tied to knowledge transfer and design validation as much as bug detection. In practice, it often fails to catch critical issues while slowing down integration. What we are seeing now is the early shape of a machine-led verification layer where the system traces logic and the human validates the outcome.”

The shift, however, is not without tradeoffs. Structured reasoning introduces additional compute and workflow overhead, raising questions about how it should be deployed in real-world development environments.  

“More steps, more tokens, more latency,” Gogia said. “In controlled experiments, this can be justified by higher accuracy. In real developer environments, this translates into slower builds, longer feedback cycles, and increased infrastructure spend. If this is applied indiscriminately, developers will bypass it. Not because they disagree with it, but because it gets in the way.”

There is also a technical risk. The researchers noted that while the structured format reduces guessing, it can also produce “confident but wrong” answers. In these cases, the AI constructs an elaborate but incomplete reasoning chain, packaging an incorrect conclusion in a convincing, highly structured format that may be difficult for a human to quickly debunk.


What next for junior developers? 1 Apr 2026, 2:00 am

Everyone is worried about junior developers. What are all these fresh-faced computer science graduates going to do now that AI is writing all the code?  

It is a legitimate concern. 

It wasn’t that long ago that the best advice I could give an early-career person interested in software development was to go to a boot camp. Sure, they could go to college and get a four-year computer science degree, but that would be expensive, take a long time, and teach them a lot of theoretical but impractical things about computers. And they wouldn’t even be doing science. 

But a six-month boot camp? There they’d learn what they really need to know—what software development companies are really looking for. They’d learn practical coding techniques, proper bug management, design specifications, JavaScript and TypeScript, source control management, and continuous integration.  

When I was a hiring manager, it didn’t take long for me to realize that a boot camp graduate was often much more ready to hit the ground running as a junior developer than a computer science graduate. 

But of course, all that fell apart overnight. Suddenly, for a low monthly payment, I could have a tireless, diligent, eager, and highly skilled junior developer who can type a thousand words a minute and reason at the speed of light. The economics of that are simply too compelling. 

Juniors begat seniors

And so what is a budding software developer to do? Or more importantly, what is a software development company to do when they realize that all those senior developers who are using Cursor are actually going to retire one day?  

Up until about 10 minutes ago, those companies would hire these intrepid young whippersnappers and put them to work fixing bugs, writing the boring code that builds systems, and slowly but surely teaching them how systems work by having them learn by doing. One became a senior developer through the experience of writing code, seeing it run, and learning what works and what doesn’t. Eventually, wisdom would set in, and they’d become sage, seasoned developers ready to mentor the next generation of developers.  

Well, we are now skipping the part where you actually become wise. But wisdom is the critical thing in this grand process. The judgment to know what is good, what is effective, and what is needed is the very commodity that makes agentic coding work. The AI model writes the code, and we seasoned veterans determine if it is right or not. 

We seasoned veterans know if the code is right or not because we’ve written tons and tons of code. But humans aren’t writing tons and tons of code anymore. And here is where I’m going to say something that I think many of you will really not like: Code doesn’t matter anymore. 

What I mean is, code is a commodity now. Code that used to take months to produce can now be produced in minutes. Yes, literally minutes. And the coding agents today are the worst they will ever be. They are only getting better, and they will only produce cleaner and cleaner code as time marches on. At some point—and that point may already be here for many of you—we are just going to stop looking at code. 

What matters is whether or not the application, you know, actually works. And if you want Claude Code or Codex to write a working application for you, you need to be able to communicate with it effectively to get it to do what you want. And strangely, the way to communicate with it is to write clearly. 

Heads up, English majors

A couple of weeks ago, I wrote that Markdown is the new programming language, and that what makes for “good code” in Markdown is the ability to write clear and concise instructions. Who would have thought that the English department would suddenly be the key to developing good software? 

Right now, the agentic coding process goes something like: 

  1. Describe the problem to Claude Code.
  2. Monitor the code Claude writes to make sure it is good code.
  3. Test the application to make sure it works correctly.
  4. Refine and improve by iterating this process. 

Step 2? It’s already becoming unnecessary. These AI agents are already writing good code, and the code they write gets better and better every day. And it is trivial to tell them to improve the code that they have already written. Iterating to improve code quality takes mere minutes. Writing the code has literally become the easiest part of developing software. 

So my advice to the kids these days: Learn to write clearly and precisely. Learn how to understand systems and describe them and their use cases. Make sure you can succinctly describe what you need software to do. English majors take note. Hiring managers? You too.


PEP 816: How Python is getting serious about Wasm 1 Apr 2026, 2:00 am

WebAssembly, or Wasm, provides a standard way to deliver compact, binary-format applications that can run in the browser. Wasm is also designed to run at or near machine-native speeds. Developers can write code in one of the various languages that compile to Wasm as a target (e.g., Rust), and deliver that program anywhere Wasm runs.

But Wasm by itself isn’t enough. An application, especially one running in a browser, needs standardized and controllable ways to talk to the rest of the system. The WebAssembly specification doesn’t speak to any of that by design. It only describes the WebAssembly instruction set, not how programs using those instructions deal with the rest of the system.

That’s what the WASI standard provides—abstractions for using the host system, such as how to perform network and storage I/O, and using host resources like clocks or sources of entropy for PRNGs.

Until now, CPython has supported WASI, but not in a formally defined way. Nothing described how CPython would support versions of WASI (the spec), or the WASI SDK (an implementation of the spec). With PEP 816, the CPython team has formally defined how to support both the spec and the SDK going forward.

Ultimately, the new definition will make it easier to deliver Python apps in the browser or anywhere else Wasm runs. There are just a few things developers need to know to ensure they’re using Wasm correctly with Python under the new rules.

How Python has historically used Wasm

Most languages, such as Rust, compile to Wasm as a binary target. Because Python is interpreted—at least, the default CPython implementation works that way—it doesn’t compile to Wasm directly. Instead, the interpreter itself is compiled to Wasm, and Python programs are run on that Wasm version of the interpreter.

There are drawbacks to this approach. For one, it means you need a full copy of the interpreter and the standard library to run any Python program; there is as yet no way to compile a Python program to Wasm as a self-contained artifact that bundles the interpreter.

Another big drawback: Any modules not written in pure Python can’t run in Wasm unless a Wasm-specific version of that module is compiled ahead of time. Unless you have a specially compiled version of, say, NumPy, you can’t use that module in Wasm.

Some of these issues are limitations of Python as a language. Its inherent dynamism makes it difficult to deploy a standalone program. Rust, by contrast, can compile to a single binary artifact for any supported target.

But some of these limits can also be attributed to the Wasm environment. For instance, many methods in the standard library aren’t available in Wasm environments because the WASI SDK doesn’t expose the interfaces those methods need. The more that Python and other languages demand such things, the more likely they are to show up in the Wasm environment.
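Until then, library code that wants to run on constrained platforms typically guards such calls at runtime. A minimal sketch of the pattern (the specific function chosen here, `os.getloadavg`, is just one example of a stdlib interface that is absent on some platforms; it is not singled out by PEP 816):

```python
import os

def cpu_load_average():
    """Return the 1/5/15-minute load averages, or None on platforms
    where the host interface is unavailable (e.g. some Wasm or
    Windows builds, where os.getloadavg does not exist or fails)."""
    if not hasattr(os, "getloadavg"):
        return None
    try:
        return os.getloadavg()
    except OSError:
        # The attribute exists but the host doesn't provide the data.
        return None
```

With version guarantees in place, maintainers can replace scattershot guards like this with a documented statement of which interfaces a given CPython/WASI pairing actually supports.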

This is where it is useful for Python to be explicit about which versions of WASI and the WASI SDK it will use going forward. Each version of Python can then provide better guarantees about the Wasm features it supports.

Wasm support in Python: WASI and the WASI SDK

Wasm support involves two things: WASI and the WASI SDK. The difference between the two is a little like the difference between the Python language in the abstract and the CPython runtime. The former (WASI) is the spec for how Wasm programs interact with the host system, which can be implemented any number of ways. The latter (the WASI SDK) is the official implementation of that spec.

The WASI SDK is a modified version of the Clang compiler, which uses a library called wasi-libc. This gives programs written in C (and C API-compatible languages) access to WASI’s APIs for the host (storage, networking, timers, etc).

In theory, we should just be able to compile a given CPython release with the most recent WASI SDK at the time. But things aren’t that simple. For one, the SDK’s biggest component, wasi-libc, doesn’t guarantee it’ll be forward- or backward-compatible. Also, some versions of the SDK may cause buggy behavior with some versions of CPython. As developers, we want to know that this version of CPython works with this version of the SDK—or at least be able to document which bugs appear with any given combination of the two.

How future releases of CPython will use WASI

CPython has been available on Wasm since version 3.11, with a mix of Tier 2 and Tier 3 support: the WASI target (wasip1) is the better-supported Tier 2 platform, while the older Emscripten target sits at Tier 3. But Tier 2 support has been confined to the WASI “Preview 1” set of system calls. And for the reasons already stated, the WASI SDK CPython uses is not necessarily the most recent version, either: it’s SDK version 21 for Python 3.11 and 3.12, and SDK version 24 for 3.13 and 3.14.

All of this will change with future releases of CPython, with a couple of hard rules in place for using WASI and its SDK:

  1. Whatever versions of WASI and the WASI SDK a given CPython release supports by its beta 1 release will be the versions supported for the lifetime of that release. For instance, if CPython 3.15 uses version 0.3 of the WASI spec and version 33 of the SDK (these are arbitrary numbers), then those versions of WASI and the SDK will be supported for that version of CPython until it reaches end of life.
  2. Any changes to the version of the WASI spec or SDK used for a particular release require approval from Python’s steering council. But this shouldn’t happen outside of some extraordinary set of circumstances—for instance, if a bug surfaced that made a given version of the SDK unusable with a given CPython release.

The benefits of WASI version guarantees for CPython

Going forward, developers can look forward to significant improvements to how Python will work with WASI:

  1. It won’t only be easier for CPython developers to know which versions of WASI and the SDK to target. It will also be easier for the rest of the WASI ecosystem to determine which Python versions are compatible with various WASI and SDK editions.
  2. Developers maintaining Python libraries with extension modules will have a better idea of how to compile those modules to Wasm for each Python point release. They will then be able to take advantage of newer WASI features sooner, knowing that a specific CPython will support them.
  3. Developers can add WASI support to their projects for a given version of CPython sooner in each release cycle for the interpreter, as the WASI and SDK versions should be locked down by the first beta release.


Enterprise Spotlight: Setting the 2026 IT agenda 31 Mar 2026, 11:25 pm

IT leaders are setting their operations strategies for 2026 with an eye toward agility, flexibility, and tangible business results. 

Download the January 2026 issue of the Enterprise Spotlight from the editors of CIO, Computerworld, CSO, InfoWorld, and Network World and learn about the trends and technologies that will drive the IT agenda in the year ahead.


Anthropic employee error exposes Claude Code source 31 Mar 2026, 7:14 pm

An Anthropic employee accidentally exposed the entire proprietary source code of the company’s AI programming tool, Claude Code, by including a source map file in a version of the tool published to Anthropic’s public npm registry account, a mistake one AI expert describes as risky.

“A compromised source map is a security risk,” said US-based cybersecurity and AI expert Joseph Steinberg. “A hacker can use a source map to reconstruct the original source code and [see] how it works. Any secrets within that code – if someone coded in an API key, for example – is at risk, as is all of the logic. And any vulnerabilities found in the logic could become clear to the hacker who can then exploit the vulnerabilities.”

However, an Anthropic spokesperson told CSO, “no sensitive customer data or credentials were involved or exposed. This was a release packaging issue caused by human error, not a security breach. We’re rolling out measures to prevent this from happening again.”

But it wasn’t the first time this had happened; according to Fortune and other news sources, the same thing happened last month.

Don’t expose .map files

Map files shouldn’t be left in the final version of code published on open source registries, where anyone can download a package; they can be sources of useful information for hackers.

According to developer Kuber Mehta, who published a blog on the latest incident, when someone publishes a JavaScript/TypeScript package to npm, the build toolchain often generates source map files (.map files). These files are a bridge between the minified/bundled production code and the original source; they exist so that when something crashes in production, the stack trace can point to the actual line of code in the original file, not to some unintelligible reference.

What’s available in these files? “Every file. Every comment. Every internal constant. Every system prompt. All of it, sitting right there in a JSON file that npm happily serves to anyone who runs npm pack or even just browses the package contents,” said Mehta.

“The mistake is almost always the same: someone forgets to add *.map to their .npmignore or doesn’t configure their bundler to skip source map generation for production builds,” Mehta said. “With Bun’s bundler (which Claude Code uses), source maps are generated by default unless you explicitly turn them off.”

Think of a source map as a file that maps minified code, which is not easily understandable to humans, back to the original human-readable source, said Steinberg. For example, he said, it may indicate that a specific portion of the executable code is performing the instructions that appear in a specific snippet of the source.

A source map can help with debugging, he added. Without it, he said, many errors would be identified as coming from a larger portion of code, rather than showing exactly where the errors occur.

The world learned of this incident when security researcher Chaofan Shou posted this message early Tuesday on X: “Claude code source code has been leaked via a map file in their npm registry!”, along with a link to the file.

A common error

Leaving source map files in a package “is an incredibly common mistake developers make quite often,” said secure coding trainer Tanya Janca. “In this specific situation, it is more serious than it would be somewhere else, mostly because of the incredibly high value of the intellectual property involved, and because now malicious actors can analyze the source code directly for vulnerabilities instead of having to reverse engineer it, which adds time, cost, and complexity.”

Ideally, Janca said, developers should harden their build environment, so they don’t ship debug information/features with production. She offered these tips to developers:

  • disable source map generation in the build/bundler tool;
  • add *.map to .npmignore, or use the package.json “files” field, to explicitly exclude maps even if they were generated during the build by accident;
  • exclude *.map files from the list of published artifacts in the continuous integration/continuous deployment environment;
  • carefully separate debug builds from production builds if there are differences; even the comments could be incredibly sensitive.
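On top of build settings like those, a pre-publish step can scan the packaged files for stray maps, so a forgotten ignore entry fails the release instead of shipping. A minimal sketch (the check itself is generic; wiring it into a specific CI system is left out):

```python
from pathlib import Path

def find_source_maps(package_dir: str) -> list[str]:
    """Return relative paths of any *.map files that would ship with the package."""
    return sorted(str(p.relative_to(package_dir))
                  for p in Path(package_dir).rglob("*.map"))

def assert_no_maps(package_dir: str) -> None:
    """Abort (non-zero exit) if any source maps are present in the package."""
    maps = find_source_maps(package_dir)
    if maps:
        raise SystemExit(f"Refusing to publish: source maps found: {maps}")
```

Run before `npm publish`, a failure here blocks the release rather than relying on every developer remembering the .npmignore entry.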

A critical layer

Any exposure of source code or system-level logic is significant, because it shows how controls are implemented, commented Dan Schiappa, president of technology and services at Arctic Wolf. With this information exposed, the number of people who now understand how the model enforces behavior, manages access, and handles edge cases increases, he said.

“In AI systems, that layer is especially critical,” he added. “The orchestration, prompts, and workflows effectively define how the system operates. If those are exposed, it can make it easier to identify weaknesses or manipulate outcomes. Knowing that attackers are still discovering the most optimal ways to leverage AI means that in any instance where a tool could be compromised, there are likely cybercriminals waiting in the wings.”


How to halve Claude output costs with a markdown tweak 31 Mar 2026, 2:58 am

In a quiet corner of GitHub better known for weekend experiments than paradigm shifts, Drona Reddy, a data analyst at Amazon US, has published a single markdown file that promises to cut Claude’s output token usage by more than half, not by changing code, but by reshaping the model’s behavior.

The file, called Claude.md and available under an MIT license, outlines a set of structured instructions that claim to reduce Claude’s output verbosity by about 63% without any code modifications.

These instructions impose strict behavioral constraints on the model, including limits on output length, emphasis on token efficiency and accuracy, controls on speculation, rules for typography, and a zero‑tolerance policy on sycophantic responses. They also simplify code generation and define clear override policies, effectively training the model to respond more concisely and deliberately.

Reducing output tokens

The rationale is straightforward: eliminate what Reddy describes as Claude’s “frivolous” habits, stripping out everything that isn’t strictly necessary. That means no automatic pleasantries like “Sure!” or “Great question!”, no boilerplate sign-offs such as “I hope this helps,” no restating the prompt, and no unsolicited suggestions or over-engineered abstractions.

It also curbs stylistic quirks like em dashes, smart quotes, and other Unicode characters that can break parsers, while preventing the model from reflexively agreeing with flawed assumptions.

At scale, that kind of austerity, according to Reddy, could translate into meaningful savings, turning small stylistic trims into outsized efficiency gains.

The data analyst also outlined three distinct use cases where the markdown file could be most effective: first, high-volume automation pipelines, such as resume bots, agent loops, and code generation, where verbosity compounds across repeated calls; second, repeated structured tasks, where Claude’s default expansiveness can add up over hundreds of interactions; and third, team environments that require consistent, parseable output formats across sessions, where tighter control over responses improves reliability and downstream usability.

In his own simulations on Claude Sonnet, Reddy said the file could save close to 9,600 tokens a day at 100 prompts, translating to roughly $0.86 in monthly savings. At 1,000 prompts a day, the savings rise to about 96,000 tokens, or $8.64 a month, while across three projects combined, he estimates reductions of nearly 288,000 tokens, equivalent to around $25.92 monthly.
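Those figures are straightforward to reproduce. The sketch below assumes 96 output tokens saved per prompt, 30-day months, and a $3-per-million-token rate; all three values are inferred from the numbers quoted above, not from any published price list.

```python
# Back-of-the-envelope reproduction of the savings estimates quoted above.
# Assumptions (inferred from the article's figures, not from Anthropic's
# pricing page): 96 tokens saved per prompt, 30-day months, $3 per
# million output tokens.
TOKENS_SAVED_PER_PROMPT = 96
PRICE_PER_MILLION = 3.00
DAYS_PER_MONTH = 30

def monthly_savings(prompts_per_day: int) -> tuple[int, float]:
    """Return (tokens saved per day, dollars saved per month)."""
    tokens_per_day = prompts_per_day * TOKENS_SAVED_PER_PROMPT
    dollars = tokens_per_day * DAYS_PER_MONTH * PRICE_PER_MILLION / 1_000_000
    return tokens_per_day, round(dollars, 2)
```

Under these assumptions, 100 prompts a day yields 9,600 tokens and about $0.86 a month, and 1,000 prompts yields 96,000 tokens and $8.64, matching the repository's own estimates.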

However, the data analyst also warned that the file can be ineffective, even counterproductive, in some scenarios, such as one-off queries, debugging deep failures, or exploratory work where expansive feedback is valuable, because the file itself consumes input tokens on every message.

“The CLAUDE.md file itself consumes input tokens on every message. The savings come from reduced output tokens. The net is only positive when output volume is high enough to offset the persistent input cost. At low usage it costs more than it saves,” Reddy wrote in the repository’s documentation.

Modest enterprise gains

Analysts do see enterprises and their CIOs benefitting from the markdown file, at least to a certain degree, especially as they struggle to balance spiraling inference bills and moving agentic or other AI pilots into production.

“A 63% token reduction can meaningfully lower inference costs and latency for enterprises running high-volume Claude workloads,” said Charlie Dai, principal analyst at Forrester.

The gains, however, may be more operational than transformative.

“For CIOs, this method offers some operational benefits as it improves output consistency, improves latency, and enforces basic token discipline, which can help in scaling automation,” said Pareekh Jain, principal analyst at Pareekh Consulting.

However, Jain pointed out that though this is a “useful tactical optimization”, it does not fundamentally change enterprise AI economics.

“In enterprise settings, the tactic is likely to translate into more modest savings because output tokens are only a portion of total usage as input context, retrieval, and agent orchestration typically dominate costs,” Jain said. “As a result, most enterprises would likely see single-digit savings rather than the headline number,” he added.

The markdown file is designed to be model-agnostic and should work across large language models that can follow structured instructions, though Reddy noted he has not tested its effectiveness on local models such as those running on llama.cpp or Mistral.


Using Azure Copilot for migration and modernization 31 Mar 2026, 2:00 am

Microsoft has given Azure many hats: a serverless platform for distributed applications, a host for security and identity services, a place for big data, and an alternative to running your own data centers and infrastructure.

It’s this last one that’s often forgotten, since much of the thinking about cloud platforms focuses on new tools and technologies rather than on the old faithful applications that have been lifted and shifted to the cloud. Lift and shift has become an increasingly important way to get rid of servers and to make some headway against the perennial problem of technical debt.

Solving technical debt

Building new applications is easy, but it’s hard to keep them up-to-date. The budget for maintenance never quite covers the necessary work, so applications and servers keep running until it’s impossible to keep them going. Historically that’s meant aged mainframes running for more than 50 years, or a nuclear power plant being controlled from an emulation running in another emulation on top of a stack of virtual machines.

That’s the extreme end of technical debt—software archaeology rather than software engineering. Day-to-day technical debt is perhaps better thought of as a drag on business, code that doesn’t quite fit the current state of a business process, requiring manual workarounds that slow things down but allow the applications to keep running. There’s associated compounding risk as applications, operating systems, storage, and networks drift away from security baselines, dropping out of support, unpatched because any updates could stop the business from operating.

Migrating to the cloud can help, but too often it’s a matter of simply replicating a physical infrastructure in virtual machines on an IaaS platform, when hyperscale cloud PaaS features could solve those problems, using automated updates and upgrades to deploy the latest features and keep applications more secure.

Automating cloud migrations

Microsoft has offered several generations of tools to migrate applications to the cloud, mostly focused on finding the right-size Azure virtual machines and the right virtual network appliances and topology, as well as importing data into cloud storage. Microsoft has provided useful ways to move applications to the cloud, but the company hasn’t helped you modernize code or infrastructure. Enterprises end up replicating technical debt in Azure instead of on their own servers.

Even if you use these tools, a migration can take months of planning and testing, and the disconnect between IT teams focusing on infrastructure and development teams focusing on the code means there’s no shared view of the entire task.

How then can Microsoft break down the barriers between teams and use migration as an opportunity to help modernize software and reduce technical debt?

Azure Copilot for migration and modernization

The latest version of the Azure Copilot recently launched, building on earlier releases and the agent model implemented in the closely related GitHub Copilot. It includes a new migration agent as part of its library of tools, using grounded AI to guide you through a simultaneous process of migrating and upgrading applications. The intent is to speed up the process by using your current environment and the capabilities of the Azure platform to define and implement the IT strategies to deliver a modern cloud infrastructure.

With the Azure Copilot handling infrastructure, the modernization tools in GitHub Copilot’s agents can help work through the necessary steps to update the code, adding support for Azure capabilities and modern cloud-native architectures.

Key to both approaches is a process of fine-tuning and grounding that uses Azure’s well-defined APIs and the constraints of domain-specific, software-defined infrastructure through tools like Bicep, Terraform, and the older but still critical Azure Resource Manager. Defining and implementing an IT strategy is one of the things an AI agent should be good at. Working alongside infrastructure and application architects, the agent defines the current state of an application, the target, where the modernized version will run, and what tools it will need. It then uses spec-driven development methodology to treat that gap as a directed graph that will first define an infrastructure and then update the code.

Having a known state at both ends of the process keeps risk to a minimum, though of course you still need to keep humans in the loop. It’s not a process that can be fully automated; instead, it’s an approach that speeds up tasks and reduces the necessary effort. In one early test, a customer was able to reduce migration effort by 70%.

Working with cooperating agents

Microsoft is using migration as an example of how agents can cooperate and help different disciplines communicate effectively. Reports generated by GitHub Copilot can be used by Azure Copilot to identify possible issues and bridge the gaps between software modernization and migration plans. The resulting insights help teams prioritize necessary work and improve the specifications and strategy that guide the work.

Using the migration agent from Azure Copilot is straightforward, as it builds on existing best practices and processes. However, don’t expect it to support all the possible migration scenarios from day one. The current preview release is designed to help move specific infrastructures to Azure by analyzing data from the existing Azure Migrate tools.

Two key scenarios are supported in this first version: moving VMware infrastructures, and working with existing environments that use Hyper-V and physical servers. In both cases, you’ll need to run the existing Azure Migrate tools to collect the data the agent will use to plan a migration. This will require installing Azure Migrate collectors or the free RVTools utility. Microsoft provides an Azure Migrate appliance that can be deployed inside either VMware or Hyper-V environments (or on bare-metal servers) to handle discovery, which helps gather and process this data.

The migration agent will run discovery for you or work with your own discovery data. Once you have data, you can use prompts to assess your infrastructure, check for servers that need upgrades, and build a plan for a lift-and-shift exercise. You can even get cost analysis and ROI reports. Other prompts add modernization steps, for example, moving data to Azure servers. From there you can build the base infrastructure for a migration and start deploying Azure resources.

Agents connect ops and dev teams

A conversational approach to working with the agent through Azure Copilot can help financial and business team members understand the effectiveness of a migration, as they can get access to costs and timescales through familiar tools. System administrators will be able to quickly get the information they need, while development teams will be able to understand what code changes might be needed to support a new infrastructure. Having Azure Copilot as a hub for these conversations can help reduce risks and keep projects on track. The information needed for good decisions is now easily accessible.

At the same time, you can have the GitHub modernization agent, run from the GitHub CLI, update the code you’re running on those servers, using the tool to guide updates to .NET and Java. The agent will analyze code and produce a modernization plan to guide development teams, or it can automate the process of updating and testing your code. It is designed to look for issues that might arise when migrating to the cloud, making it an important component of a suite of migration tools.

With these new AI-powered tools, you’re able to speed up the process of moving complete applications to the cloud, with an ROI assessment, a migration plan, and the necessary updates to what may be outdated code. With new infrastructure and code, you’re able to start dealing with long-term technical debt and adding new features that can improve business performance and offer new services both inside and outside your organization.


Enterprises demand cloud value 31 Mar 2026, 2:00 am

The release of the Flexera 2026 State of the Cloud Report provides vital insights into how enterprises are navigating the constantly changing landscape of cloud computing and increasingly AI-driven workloads. The findings show that the enterprise cloud discussion has changed significantly in recent years.

What started as a focus on cost-cutting and simple lift-and-shift migration has evolved into a complex balancing act involving governance, innovation, and measurable business value. This shift is not only logical but also expected, especially considering the rise of AI and the shared experience of enterprises that have faced unexpected cloud costs and unchecked spending for years.

Let’s explore what the report shows, why these findings are expected given industry trends, and what enterprises can do to extract real value from their ongoing—indeed, increasing—cloud investments.

Moving beyond cost-cutting

The main finding from the 2026 State of the Cloud Report is that organizations are clearly moving past an era where “cost-cutting at all costs” was the primary cloud strategy. We see a notable 12% increase in organizations reporting value delivered to business units year over year, even as the emphasis on simple efficiency and savings has decreased by 6%.

This marks a turning point in cloud adoption maturity. For years, cloud evangelists and technology leaders highlighted the cost advantages of moving from on-premises systems to the public cloud, and rightly so. However, the initial enthusiasm faded for many CIOs and CFOs as complexities increased, such as cloud sprawl, unpredictable billing, or wasted resources. According to the report, about 29% of cloud spending is still wasted, a sobering figure for any CFO’s office, but this also marks an improvement compared to years of uncontrolled growth. Now, the focus is on making productive investments and how cloud services can significantly impact business results.

Wiser organizations are trying to understand the cost of each service, align spending with results, and develop the discipline to measure value at every level, from individual projects to large-scale digital transformation efforts.

Oversight and governance models

As cloud complexity increases, so does the need for controls and clear leadership. The State of the Cloud Report shows a growth in the adoption of cloud centers of excellence (CCOEs) and finops teams, with 71% of organizations now using CCOEs. More organizations have dedicated finops teams to advise, manage, and optimize cloud spend. Teams are also more active in overseeing SaaS and cloud software usage.

What unites these efforts is a shift toward centralized accountability. Enterprises have learned the hard way that finops, CCOEs, and asset management cannot function in isolation. Cloud cost management collapses when roles and responsibilities are scattered. The most developed organizations coordinate efforts across teams to maintain shared views on spending, usage, and business priorities.

Some enterprises may worry about adding bureaucracy, but establishing transparency and focus will balance innovation with risk management and connect cloud spending directly to organizational goals.

Opportunities and risks of AI

Perhaps the most important change in this year’s report is the rise in AI-driven workloads, especially the adoption of generative AI. The data is clear: 45% of organizations now use genAI extensively, up from 36% a year earlier. Companies see significant opportunities for innovation and gaining a competitive edge through AI. At the same time, they realize that AI workloads are flexible, can incur unpredictable costs, and are often billed under new consumption-based pricing models that can quickly escalate without strong governance.

While there’s excitement about AI’s potential, there’s also a rapidly growing consensus that strong governance and cost controls must be put in place before spending spirals far beyond projections. We’re starting to see organizations appoint dedicated AI governance leaders who can ensure that innovation is safe, scalable, and firmly tied to business value and operational accountability (unlike the early, chaotic days of cloud).

SaaS spending, too, has exploded, with the most common monthly range among respondents jumping to $200,000 to $500,000. This is a sharp increase from the $50,000-to-$100,000 tier that dominated last year, driven largely by the proliferation of AI-powered features and complex, usage-based pricing models. The conclusion: Cloud bills won’t shrink anytime soon, but organizations are determined not to repeat past mistakes. The focus now is on ensuring every dollar has a tangible return.

Three ways to boost cloud value

If you’ve tracked cloud growth as long as I have, these shifts are both natural and expected. Ten years ago, cost savings were the main focus of every cloud business case. As companies moved beyond the initial adoption phase, issues like sprawl, waste, and sticker shock emerged. The rise of AI and its endless demand for compute and data have increased both the opportunities and risks. Enterprises are seeking a balanced approach that carefully considers innovation, costs, and business value.

Cloud cost management is evolving into a smarter, value-driven discipline that’s keeping pace with business priorities and fast-changing markets. Here are three recommendations for enterprises looking to increase value in their cloud services:

  1. Double down on centralized governance. Build or strengthen your CCOE and finops capabilities, but don’t let these teams operate independently. Accountability and transparency must be shared and tied directly to business outcomes.
  2. Bring AI governance into the fold early. Establish clear ownership, set measurable goals, and don’t hesitate to manage experimental projects with the same rigor as production workloads. AI innovation is too expensive and too important to leave to chance.
  3. Treat finops as an ongoing discipline, not a one-time project. Invest in the people, processes, and tools that support continuous tracking, optimization, and reporting. Incorporate cost and value discussions into every cloud project’s life cycle.

The key message from this year’s report is simple: Enterprises are demanding clear value from their cloud and AI initiatives. That’s expected, and it’s exactly where mature organizations should be headed.


What front-end engineers need to know about AWS 31 Mar 2026, 2:00 am

Front-end engineers usually think performance problems live in the browser. When a page feels slow, we inspect bundle size and rendering. When something breaks, we open the network tab. If users complain, we optimize components or tweak state management. For a long time, I approached production issues the same way, assuming the root cause had to exist somewhere inside the UI. Over time, however, I started noticing a pattern: many confusing ‘front-end’ problems were not actually caused by front-end code. 

A login flow would occasionally fail and then work on refresh. An API would be slow only the first time. A deployment fix would be live for me, but not for a user. Sometimes, the interface displayed outdated data immediately after release. These issues were not caused by typical JavaScript errors. They were influenced by infrastructure behavior, particularly in environments running on AWS. 

Front-end engineers don’t need to manage servers to be affected by them. Modern web applications are no longer a single application talking to a single server. They sit on top of distributed cloud systems, and those systems influence how a UI behaves. Understanding a few core AWS concepts does not turn a front-end developer into a cloud engineer, but it does make debugging faster and UI design decisions more realistic. 

The hidden gap between front end and the cloud 

Front-end and back-end teams usually interact through a simple contract: an endpoint. The front end receives a URL and consumes data from it. From the UI’s perspective, it is just a request returning JSON. Behind that URL, however, is often a chain of services including gateways, caching layers, routing systems, and load balancers. 

Because these layers are invisible, front-end engineers may make assumptions that don’t always match how distributed systems behave. When an API responds slowly, we suspect inefficient code. When requests fail intermittently, we assume unstable networking. When behavior changes between users, we think state handling is incorrect. In practice, many of these behaviors are predictable consequences of the infrastructure itself. 

The result is that UI code frequently compensates for system behavior without understanding it. Developers add unnecessary retries, misleading error messages or extra loading states. Once you recognize how the cloud shapes responses, the behavior stops appearing random and starts appearing explainable. 

How cloud infrastructure changes front-end behavior 

CDN hosting and the “old UI after deployment” problem 

Most modern front ends are deployed as static files. The application is essentially a set of HTML, CSS and JavaScript bundles delivered to the browser. In AWS environments, these files are commonly served through a content delivery network backed by object storage. This improves performance because users receive files from a location geographically close to them rather than from a single centralized server. 

However, that performance improvement comes with caching. After a deployment, some users may still see the previous version of the interface. A hard refresh fixes it, and waiting a short time fixes it as well. This often feels like a failed deployment, but it is expected behavior. The network is doing what it was designed to do: reuse previously downloaded files to improve speed. In practice, this behavior often comes from a combination of CDN edge caching, browser caching and cache headers rather than a single caching layer. 

From a front-end perspective, this changes how releases should be handled. Deployment is no longer only about shipping new code; it is also about ensuring browsers and caching layers request updated files. Versioned filenames and cache-aware design become important front-end concerns. Understanding that the infrastructure intentionally preserves older assets makes these issues predictable instead of mysterious. 
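The versioned-filename idea can be sketched in a few lines. This is an illustrative Python sketch of the technique; in practice a bundler such as webpack or Vite generates these content hashes automatically, and the function name and hash length here are arbitrary choices:

```python
import hashlib

def hashed_filename(name: str, content: bytes) -> str:
    """Derive a cache-busting filename from the file's content.

    A new build with changed bytes yields a new URL, so the CDN and
    browser can cache aggressively yet never serve a stale bundle.
    """
    digest = hashlib.sha256(content).hexdigest()[:8]
    stem, dot, ext = name.rpartition(".")
    return f"{stem}.{digest}.{ext}" if dot else f"{name}.{digest}"

v1 = hashed_filename("app.js", b"console.log('v1');")
v2 = hashed_filename("app.js", b"console.log('v2');")
assert v1 != v2  # changed content forces a cache miss
assert v1 == hashed_filename("app.js", b"console.log('v1');")  # unchanged content stays cached
```

Because every changed asset gets a new URL, only the short-lived HTML entry point that references those URLs needs conservative cache headers.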

Serverless APIs and the slow first request 

Another behavior front-end engineers commonly observe is that an API request can be unusually slow the first time and normal afterward. This can be confusing because the same endpoint suddenly becomes responsive without any code changes. 

This behavior occurs because the API runs on serverless compute. Instead of a constantly running server, the platform initializes an execution environment only when a request arrives. The initial request includes the startup time required to initialize that environment. Once active, subsequent requests respond quickly. 

For UI design, this distinction matters. A loading state designed around consistent response times may incorrectly display an error or timeout during a normal cold start. Users interpret this as a broken feature even though the system is functioning correctly. Recognizing that occasional long responses are architectural rather than faulty allows front-end developers to design more forgiving loading states and avoid unnecessary failures. Cold starts are infrequent under steady traffic but noticeable in low-traffic or sporadic workloads. 
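One way to make a loading state cold-start-tolerant is to give the first request to an endpoint a larger timeout budget than subsequent warm requests. A minimal sketch, where the specific numbers are illustrative assumptions rather than AWS defaults:

```python
def timeout_for(attempt: int, warm_timeout: float = 2.0,
                cold_start_allowance: float = 8.0) -> float:
    """Pick a per-request timeout that tolerates a serverless cold start.

    The first attempt gets a generous budget, since initializing a fresh
    execution environment can legitimately take several seconds; later
    attempts assume the environment is warm and fail fast instead.
    """
    return warm_timeout + (cold_start_allowance if attempt == 0 else 0.0)

assert timeout_for(0) == 10.0  # first try: allow for initialization
assert timeout_for(1) == 2.0   # warm retries: fail fast
```

The design choice is simply to stop treating one slow first response as an error condition worth surfacing to the user.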

Understanding this also changes debugging. Not every delay is caused by network speed or inefficient queries. Sometimes the system is simply initializing itself in response to real usage patterns. 

Distributed systems and intermittent failures 

One of the most difficult production issues to investigate is a problem that cannot be reproduced locally. An interface may work consistently for developers but fail for certain users. Requests occasionally return server errors and then succeed moments later. 

Cloud environments distribute traffic across multiple machines and sometimes multiple regions. During deployments or scaling events, some users may temporarily reach instances that are being replaced, warming up or failing health checks. The infrastructure is designed for availability, but brief inconsistencies are normal in distributed systems and eventual consistency models. 

This reality affects front-end reliability. Interfaces benefit from not assuming every request will succeed immediately. Instead, they should recover gracefully, allow safe retries and present clear feedback to the user. When the UI anticipates occasional failures, the application feels significantly more stable even when the back-end behavior has not changed. 
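A common way to recover gracefully and allow safe retries is exponential backoff with jitter, retrying only on transient server errors. A minimal sketch, with illustrative status codes and attempt counts:

```python
import random

def retry(call, attempts: int = 3, base_delay: float = 0.1,
          transient: tuple = (502, 503), sleep=lambda s: None):
    """Retry a request-like callable only on transient server errors.

    Exponential backoff plus jitter spreads retries out so a fleet of
    clients doesn't hammer a recovering instance in lockstep.
    """
    for attempt in range(attempts):
        status = call()
        if status not in transient:
            return status  # success (or a non-retryable error): surface it
        sleep(base_delay * (2 ** attempt) * (1 + random.random()))
    return status  # still failing after all attempts

# Simulate an instance that errors twice mid-deployment, then recovers.
responses = iter([503, 502, 200])
assert retry(lambda: next(responses)) == 200
```

Note that client errors such as a 404 are returned immediately: retrying them wastes time and can mask real bugs.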

Recognizing these failures as systemic rather than accidental helps teams avoid spending time debugging code that is functioning as intended. 

Why this matters for front-end engineers 

Understanding cloud behavior changes how front-end engineers approach everyday work. Instead of assuming uniform response times and perfectly consistent data, developers begin designing for real conditions: cached responses, variable latency and temporary unavailability. 

This shift improves both debugging and design. Problems are diagnosed more quickly because the source is clearer, and user interfaces become more resilient. Loading states feel more natural, errors are more accurate and deployments cause fewer surprises. 

Front-end engineers do not need to configure infrastructure or manage environments. However, modern interfaces are the visible layer of a distributed system. Learning a small amount about how cloud platforms behave helps developers align UI behavior with system reality. 

Knowing a few AWS fundamentals does not make someone an operations specialist. It makes them a front-end engineer who understands the environment their application runs in, and that understanding often has a greater impact on user experience than additional front-end optimizations. 

Disclaimer: The views expressed in this article are my own and do not represent those of my employer. 

This article is published as part of the Foundry Expert Contributor Network.


How Apache Kafka flexed to support queues 31 Mar 2026, 2:00 am

Since its initial release in 2011, Apache Kafka has cemented itself as the de facto platform for event streaming. Kafka enthusiast Tim Berglund often refers to it as the “universal data substrate.” This is made possible in large part by the Kafka ecosystem that enables connectivity between Kafka and external systems (Kafka Connect) and a Java stream processing library (Kafka Streams).

The latest release of Apache Kafka delivers the queue-like consumption semantics of point-to-point messaging. After development and testing across recent releases, this feature is generally available in Kafka 4.2.

Let’s start with a quick “compare-and-contrast” of event streaming and message queuing. Event streaming is for high-volume, real-time processing of an unbounded, continuous stream of data, and it allows consumers to replay old events as needed. Consumer applications record the offset (the ordinal position in each topic partition) of the last event they successfully processed. If a consumer terminates or restarts, it’s able to resume processing the assigned partition from the last committed offset. Example use cases include internet ad attribution, updating ride-share status, and monitoring for credit card fraud. This is the space where Kafka has thrived, with adoption by over 80% of all Fortune 100 companies.

Message queues are used for point-to-point communication, where a message is typically consumed once and removed from the queue. Unlike with event streaming, consuming applications are able to acknowledge each message. This messaging pattern decouples applications and services via guaranteed, one-time processing for tasks such as in-app notifications to mobile devices, generating payroll records, or calling an AI model. Popular platforms in this space include RabbitMQ, ActiveMQ, and IBM MQ.

These message queue use cases have been a “square peg in a round hole” for Apache Kafka. Why? For starters, scaling the “traditional” Kafka consumer group is constrained by the number of topic partitions. Most notably, Kafka consumers don’t have message-level acknowledgement semantics. It’s these features that enable consumers in message queue systems to cooperatively operate on the messages in a queue.

This is the major motivation behind KIP-932: Queues for Kafka. Let’s see how this Kafka implementation of message queuing could be an important tool in your event-driven architecture.

Scaling Kafka consumer applications

Traditionally, parallel processing of Kafka topic data is constrained by the number of partitions of the topic being consumed. The broker assigns consumption of each partition to a single member of a consumer group. Once the membership of the consumer group equals the number of partitions, any new consumers added to the group will be idle.
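That constraint is easy to model. The sketch below is illustrative Python, not the broker’s actual assignor logic: partitions are dealt out round-robin, one owner each, and any surplus consumers receive nothing:

```python
def assign(partitions: int, consumers: list) -> dict:
    """Round-robin partitions across a 'traditional' consumer group.

    Each partition is owned by exactly one group member, so any
    consumers beyond the partition count receive nothing and sit idle.
    """
    assignment = {c: [] for c in consumers}
    for p in range(partitions):
        assignment[consumers[p % len(consumers)]].append(p)
    return assignment

group = assign(3, ["c1", "c2", "c3", "c4"])
assert group["c4"] == []                         # fourth consumer is idle
assert sum(len(v) for v in group.values()) == 3  # each partition assigned once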

Confluent Queues for Kafka 01

This diagram illustrates three instances in a consumer group subscribed to a topic with three partitions — meaning we’ve maxed out our parallel processing potential for this topic.

Confluent

KIP-932 adds a new type of group called a share group. Nothing changes about how the data is written to Kafka by producer applications or how data is stored in Kafka. Your event streaming use cases can operate on the same topics.

Share groups introduce a new cooperative consumption model, where consumers in a share group work in a similar fashion to consumers/subscribers in message queuing systems. On the broker, each topic-partition has a corresponding share partition which tracks the lifecycle of each message in relation to the share group. This allows the share consumers to be scaled beyond the number of topic partitions.

Confluent Queues for Kafka 02

This diagram depicts the new cooperative consumption model — where multiple members of the consumer group process data from a single topic partition.

Confluent

This cooperative consumption from a topic partition also means we lose the partition-level processing order guarantees of the “traditional” Kafka consumer. That’s the trade-off for this scaling, but cooperative consumption is intended for use cases where throughput and scale take precedence over processing order.

Message-level acknowledgement

The APIs for KIP-932 should be familiar to developers who are already using Kafka. For starters, nothing changes about how events are produced to Kafka topics. On the consumer side, the KafkaShareConsumer interface is very similar to the existing KafkaConsumer. Consumer applications will poll for available messages and process each resulting ConsumerRecord instance.

The consumers now have the ability to acknowledge the delivery of each record on an individual basis. By default, every message is implicitly acknowledged as successfully processed. However, there are scenarios where the developer needs more fine-grained controls, particularly around error handling and long-running tasks.

By using the value of explicit for the consumer configuration’s share.acknowledgement.mode, the code takes on the responsibility of specifying how each message should be acknowledged. The available AcknowledgementType values are ACCEPT, RELEASE, REJECT, and RENEW. These values influence the state of each message in relation to the share group. Those states are AVAILABLE, ACQUIRED, ACKNOWLEDGED, and ARCHIVED.

Confluent Queues for Kafka 03

The state machine that controls the life cycle of messages based on these acknowledgement types is detailed in this diagram.

Confluent

Only messages in an AVAILABLE state can be fetched by a consumer. When fetched, a message transitions to the ACQUIRED state and a delivery count for that message is incremented. This effectively “locks” this message from fetches by other members of the share group.

Once ACQUIRED, a message is expected to be processed in a finite amount of time. If this “lock” or “lease” expires, the message is either returned to the AVAILABLE state or moved to the ARCHIVED state, based on the delivery count of the message. The state and delivery count of each message are tracked in the share partition. This provides a built-in retry mechanism: when processing fails in a way that could be reattempted, the consumer can acknowledge the message using the RELEASE type, making it available for redelivery.

If message processing completes successfully, that message is acknowledged with the ACCEPT type. This transitions the message to the ACKNOWLEDGED state.

There are cases where processing takes a non-deterministic amount of time. Perhaps the consumer calls a third-party or partner API. Maybe it’s augmenting the message with the result of an LLM call. These aren’t “failures,” and the processing code may need more time to complete. In this case, acknowledge the message with the RENEW type to reset the lock.
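The lifecycle described above can be modeled as a small state machine. This is an illustrative Python sketch of the semantics, not the Java client API, and the max-delivery threshold here is an assumption (the broker’s limit is configurable):

```python
from enum import Enum

class State(Enum):
    AVAILABLE = "available"
    ACQUIRED = "acquired"
    ACKNOWLEDGED = "acknowledged"
    ARCHIVED = "archived"

class SharedMessage:
    """One message's lifecycle within a share partition."""

    def __init__(self, max_deliveries: int = 5):
        self.state = State.AVAILABLE
        self.deliveries = 0
        self.max_deliveries = max_deliveries

    def fetch(self):
        # Only AVAILABLE messages can be fetched; fetching "locks" the
        # message so other share-group members can't see it.
        assert self.state is State.AVAILABLE
        self.state = State.ACQUIRED
        self.deliveries += 1

    def acknowledge(self, ack: str):
        assert self.state is State.ACQUIRED
        if ack == "ACCEPT":       # processed successfully
            self.state = State.ACKNOWLEDGED
        elif ack == "RELEASE":    # retryable failure: make it fetchable again
            self.state = State.AVAILABLE
        elif ack == "REJECT":     # permanent failure: never redeliver
            self.state = State.ARCHIVED
        # "RENEW" would extend the acquisition lock; the state is unchanged

    def lock_expired(self):
        # Lease timed out: redeliver, or archive after too many attempts.
        assert self.state is State.ACQUIRED
        self.state = (State.ARCHIVED if self.deliveries >= self.max_deliveries
                      else State.AVAILABLE)

msg = SharedMessage()
msg.fetch(); msg.acknowledge("RELEASE")  # first attempt fails, retry allowed
msg.fetch(); msg.acknowledge("ACCEPT")   # second attempt succeeds
assert msg.state is State.ACKNOWLEDGED and msg.deliveries == 2
```

The delivery counter is what distinguishes a retryable hiccup from a poison message: once the count exceeds the limit, the message is archived rather than redelivered forever.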

Unifying messaging protocols and infrastructure

Many organizations have both event streaming and message queuing use cases. This often means operators are maintaining and supporting Apache Kafka and an older message queuing system. Developers integrate applications with different messaging libraries and protocols in the same application code base. All of this happens as the C-suite is asking why we’re paying for multiple messaging solutions.

Consolidating these messaging use cases onto Apache Kafka will make producing applications simpler to develop, deploy, upgrade, and maintain. It will also help consumer applications scale to meet the needs and SLAs of the messages being processed.

Unlike traditional message queue systems, events in these “queues” enjoy the durability and storage guarantees we’ve come to rely on in Apache Kafka. Developers of consumer applications determine if the events should be processed as event streams or queues.

Operators and SREs (site reliability engineers) tend to like simplicity. (That could be due to the inverse correlation between simplicity and the number of production incidents.) Unifying these messaging platforms means fewer systems to configure, deploy, and patch. And that also addresses the concerns of the C-suite by lowering the total cost of ownership for the overall application infrastructure.

What queues for Kafka means for teams

KIP-932 brings long-awaited point-to-point semantics to Apache Kafka. This implementation layers queue-like consumption and message-level acknowledgment onto the durability, scalability, and throughput that have made Kafka mission-critical infrastructure for businesses from startups to large enterprises.

For development teams, this means writing applications against a single messaging API rather than juggling multiple protocols. For operations teams, it means consolidating infrastructure and reducing complexity. And for organizations, it means lower total cost of ownership without sacrificing the specific semantics each use case requires.

KIP-932 is available in Apache Kafka 4.2 and Confluent Cloud, with support coming to Confluent Platform version 8.2. Developers can explore the implementation and start testing queue-based consumption patterns now. For more about KIP-932 and other event streaming topics, visit Confluent Developer for free learning resources curated by our team of experts.

New Tech Forum provides a venue for technology leaders—including vendors and other outside contributors—to explore and discuss emerging enterprise technology in unprecedented depth and breadth. The selection is subjective, based on our pick of the technologies we believe to be important and of greatest interest to InfoWorld readers. InfoWorld does not accept marketing collateral for publication and reserves the right to edit all contributed content. Send all inquiries to doug_dineley@foundryco.com.


Leak reveals Anthropic’s ‘Mythos,’ a powerful AI model aimed at cybersecurity use cases 30 Mar 2026, 4:55 am

Anthropic didn’t intend to introduce Mythos this way. Details of what it calls its most capable AI model yet surfaced through a data leak in its content management system (CMS), revealing an LLM with sharply improved reasoning and coding skills.

The leak, first identified by independent security researchers last week, was the result of company staffers inadvertently exposing material about the LLM, including a draft blog post, via a publicly accessible data repository.

Following disclosure of the issue, Anthropic restricted public access to the data store, later attributing the exposure to a configuration error in its CMS and confirming the model’s existence to Fortune, which was first to report the leak.

Apple-focused leaker M1Astra also flagged the exposure, archiving a copy of a draft Anthropic blog post about Mythos on X before access was restricted.

In that draft, Anthropic itself struck a cautious tone, signaling concern about the model’s potential implications on cybersecurity.

“In preparing to release Claude Mythos, we want to act with extra caution and understand the risks it poses — even beyond what we learn in our own testing,” the company wrote, adding that it is particularly focused on assessing near-term cybersecurity risks.

The blog further stated that Anthropic wants to seed Mythos across enterprise security teams first and has already been testing the model’s cybersecurity prowess with a “small number of early access customers.”

The rationale seems straightforward: if today’s models can already identify and even help exploit software vulnerabilities, a more capable system like Mythos could significantly accelerate both discovery and misuse — raising the stakes for defenders and attackers alike.

Pareekh Jain, principal analyst at Pareekh Consulting, says Mythos could cut both ways for CISOs and enterprise security teams, compressing the gap between cyber offense and defense.

While models like Mythos could transform security by automating vulnerability discovery, continuous red-teaming, faster triage, and large-scale threat hunting, they could also make cyberattacks easier by letting AI agents act autonomously with high skill, Jain said.

That risk for CISOs is not theoretical, Jain added, as earlier-generation models were quickly repurposed into tools for developing malware.

The risk is even higher with Mythos because of its capabilities like “recursive self-fixing,” Vladimir Belomestnov, senior technical specialist at HCLTech, wrote in a post on LinkedIn.

“The leaked files highlight a capability for the AI to autonomously identify and patch vulnerabilities in its own code. Even if this is currently limited to assisted exploitation, it suggests a narrowing gap between human and machine software engineering,” Belomestnov wrote.

However, Anthropic appears to be some distance from a full release of the model.

“Mythos is also a large, compute-intensive model. It’s very expensive for us to serve, and will be very expensive for our customers to use. We’re working to make the model much more efficient before any general release,” the copy of the draft blog post reads.

What is clear, however, is that the company is already planning a phased rollout targeting cybersecurity use cases.

“We’ll be slowly expanding access to Claude Mythos to more customers using the Claude API over the coming weeks. Since we’re particularly interested in cybersecurity uses, that’s where we aim to expand the EAP initially,” the company wrote in the draft blog post.

Another copy of the blog post names the model Capybara. Anthropic hasn’t made clear what the model’s final name will be.

The indecision over the model’s name, though, didn’t stop it from rattling markets last week. Shares of cybersecurity vendors, including CrowdStrike, Palo Alto Networks, Zscaler, and Fortinet, fell as investors assessed what more capable models within Claude Code Security could mean for the competitive landscape.

However, Avasant’s research director, Gaurav Dewan, was more optimistic about Mythos’ impact on vendors: “Powerful models will not replace cybersecurity platforms.”

Rather, Dewan sees vendors increasingly embedding frontier models from Anthropic and OpenAI and others into their stacks for vulnerability discovery, code and cloud posture management, and threat investigation and response automation.

“One can expect partnerships and controlled integrations, not disintermediation. Vendors that already own telemetry, workflows, and enforcement will benefit most,” Dewan added.

The article originally appeared in CSO.


The starkly uneven reality of enterprise AI adoption 30 Mar 2026, 2:00 am

Paraphrasing William Gibson, the future of AI is here, but it’s nowhere close to evenly distributed yet.

Last week in London, I had two conversations about enterprise AI that obliterated any semblance of a neat and tidy story of AI adoption. In the first meeting, the head of engineering at a large hedge fund told me about engineering teams with fleets of agents in full production, and (in his personal case) all code is written by LLMs. (Junior hires, interestingly, aren’t allowed to use LLMs for code assistance.) In another meeting, a data engineer at a large retail bank described the exact opposite: No agents and sparse use of LLMs. Maybe other parts of the bank are moving faster on AI, but his division clearly isn’t.

This isn’t about one company “getting” AI and the other not. Rather, it’s a reminder that even within the same company there are wildly divergent adoption curves for new technologies. AI is widening the gap between teams that can absorb it operationally and teams that can’t. That’s what the best recent data suggests, too. McKinsey found that 88% of respondents say their organizations are using AI in at least one business function, but only about one-third say their companies have begun scaling AI programs. As for agents, 23% report scaling an agentic AI system somewhere in the enterprise, while 39% are still just experimenting. And in any given function, no more than 10% say they’re scaling agents.

Broad usage, in other words, is not the same thing as deep institutional change. In short, there’s still time to figure out AI. You’re not behind.

Cue the engineering boom

I keep hearing that “finance is cautious” or “regulated industries are behind” or “everyone is building with agents.” None of that is quite true. Some financial firms are moving aggressively. Some aren’t. Some teams inside the same firm are doing both at once. Deloitte’s 2026 enterprise AI research makes the same point from another angle. Only 25% of respondents said they had moved 40% or more of their AI pilots into production. Just 34% say they’re using AI to deeply transform their businesses (a number I suspect is more aspirational than actual), while 37% are still using it at a surface level with little or no change to core processes. That sounds a lot less like a tidal wave and a lot more like a messy, uneven organizational test.

Same as it ever was, right?

And that, in turn, is why I think a lot of the “AI will wipe out software jobs” talk is wrong and misses the point. The interesting thing about AI coding tools isn’t that they make software cheaper to produce. It’s what companies do with that lower cost. Box CEO Aaron Levie recently invoked Jevons paradox to explain exactly this dynamic: When a capability becomes cheaper and easier to consume, demand for it often rises rather than falls. That’s not a law of nature, but it is a pretty good description of what technology has done for…ever. Cloud computing didn’t lead companies to need less compute. It made them build more things that consumed compute. AI-assisted coding may be doing something similar for software itself.

This is where the data on engineering jobs gets interesting. Lenny Rachitsky recently highlighted that engineering openings are at their highest levels in more than three years. The underlying TrueUp data shows 67,665 open engineering jobs as of March 2026, up 78.2% from the recent low. More importantly, this isn’t just concentrated at the very top end of the market. TrueUp’s breakdown shows 44.6% of posted engineering roles within tech companies are entry and mid-level, versus 38.3% at senior level and 13.8% at senior-plus. So no, the data doesn’t say AI is eliminating roles for junior developers; rather, it says companies still want a lot of engineers, even as AI tools spread throughout the profession.

There’s a cleaner way to understand what’s happening. AI isn’t killing the need for engineers. It’s changing what enterprises want from engineers.

Stack Overflow’s 2025 survey found that 84% of respondents are using or planning to use AI tools in development, and just over half of professional developers use them daily. McKinsey’s software development research found that the highest-performing AI-driven software organizations are seeing 16% to 30% improvements in productivity, customer experience, and time to market, along with 31% to 45% improvements in software quality. But McKinsey’s crucial point is that these gains don’t come from sprinkling copilots over an unchanged process. They come from reworking roles, workflows, and the full product development system. That’s a much harder organizational challenge than buying licenses for a coding assistant.

Software engineering is alive and well

Let’s go back to my conversations in London. The hedge fund leader may be an early glimpse of where parts of enterprise engineering are headed. Less time hand-authoring code, more time specifying, reviewing, steering, and orchestrating systems that increasingly generate code for you. But that does not mean the retail bank division is irrationally lagging. In a heavily regulated environment, code generation is not the hard part. Governance is. Deloitte reports that only 21% of surveyed companies currently have a mature governance model for autonomous agents (and those 21% are probably kidding themselves). At the same time, 73% cite data privacy and security as a top risk, and 46% cite governance capabilities and oversight. That’s not bureaucracy for its own sake. It’s a recognition that plugging non-deterministic systems into deterministic, compliance-heavy environments gets messy fast.

Still, caution isn’t free. Every quarter a team spends in pilot mode is a quarter in which more aggressive peers are building operational muscle. OpenAI’s enterprise usage data is useful here because it shows how uneven that muscle-building already is. Frontier workers, defined as the 95th percentile of adoption intensity, send six times more messages than the median worker. Frontier firms send twice as many messages per seat. OpenAI says the primary constraints are no longer model performance or tools, but rather organizational readiness and implementation.

This rings true to me. In my experience, the real divide is increasingly not between companies that have access to AI and those that don’t. It’s between teams that have learned how to integrate AI into repeatable work and teams that are still treating it as a promising but dangerous sideshow, as I’ve written.

This is also why I think the distinction of task versus job matters. Writing a chunk of boilerplate code is a task. Engineering is a job. Jobs bundle judgment, trade-offs, accountability, architecture, security, integration, testing, and the ugly reality of operating systems in the real world. AI can automate more tasks, but it hasn’t eliminated the need for jobs, especially in environments where bad software decisions carry real operational or regulatory consequences. In fact, McKinsey’s broader AI survey found that most organizations are still navigating the transition from experimentation to scaled deployment, and that high performers stand out precisely because they redesign workflows and treat AI as a catalyst for innovation and growth, not just efficiency. That is a very different thing from saying, “We gave everyone a chatbot and now we need fewer people.” (By the way, that would be a very naive statement.)

So no, AI isn’t plodding (or rocketing) toward one uniform enterprise future in which software engineers quietly fade away. Instead AI is splitting enterprises into fast-learning and slow-learning teams and is rewarding organizations that redesign work, govern risk, and turn lower software costs into more software, not less. The code may be getting cheaper, but the ability to decide what should be built, how it should fit together, and how to keep it from breaking the business keeps increasing in value.

That’s not the death of software engineering. It’s the repricing of it, and every company and every team is paying different prices.


How to build an enterprise-grade MCP registry 30 Mar 2026, 2:00 am

Just as integration catalogs were must-haves at the peak of SaaS, Model Context Protocol (MCP) servers are now becoming all the rage for connecting AI agents and enterprise systems. 

In this paradigm, developers aren’t hand-coding API calls to external systems, nor are users clicking “click to integrate” and entering credentials into GUIs. Instead, agentic systems are looking up available MCP servers and making MCP tool calls autonomously. To guide this process, MCP registries are emerging as a way to catalog available MCP servers and guide AI agent workflows.

“MCP registries are increasingly becoming the integration catalog for agentic systems,” says Ebrahim Alareqi, principal machine learning engineer at Incorta, provider of an open data delivery platform. “They give developers and platform teams a centralized inventory of the tools, agents, and capabilities available to an organization.”

MCP registries shorten time to integration and act as a discovery point for AI agents. But whether you’re repurposing an off-the-shelf MCP registry or building your own, the job comes with daunting technical challenges, like figuring out semantics for tool discovery and adding in guardrails for safe autonomous usage.

“A good MCP registry is more than a directory of tools,” says Derek Ashmore, agentic AI enablement principal at Asperitas Consulting, a cloud computing consultancy. “It’s part of your control plane.” For Ashmore, an MCP registry needs strong identity and discovery, policy-aware metadata, life-cycle controls, security guardrails, and data to inform observability.

Below, we’ll dive deeper into what makes up a solid, functional MCP registry. We’ll explore the features and requirements of MCP registries, examine the emerging implementation advice for enterprises, and determine when you should use private versus public MCP registries.

What is an MCP registry?

An MCP registry acts as a single source of truth for MCP servers. It’s a catalog of approved, compliant MCP servers and MCP tools that are available within an organization that can be exposed to AI agents. By pointing AI agents to an MCP registry endpoint, an enterprise can equip AI workflows with actionable read-write access across engineering, business, and SaaS systems that are sanctioned and configured for company use.

To date, there are a handful of public MCP registries out there, ranging from open directories to curated registries and more enterprise-ready implementations. The most obvious example is the official MCP Registry, an open-source catalog of MCP servers with a live REST API for search and discovery. The official MCP registry aligns with the MCP registry specification, which provides a standardized method to build interoperable MCP registries.

Other public resources are more like static lists, such as directories from MCP.so, Glama.ai, Mastra.ai, and OpenTools. Interestingly, one open-source tool, MCP-Get, provides a command-line option for interaction.

More and more digital services are beginning to embed MCP catalogs into their platforms, too. Docker and Microsoft, for instance, are building curated MCP catalogs focused on their own platform ecosystems. GitHub hosts a directory of MCP servers for easy installation, and has begun to add controls for internal registry configurations. The MACH Alliance, an industry consortium focused on composable commerce, is also promoting an MCP-compliant registry initiative.

Beyond these efforts, MCP registries are moving beyond public directories — enterprises are now constructing private, self-hosted registries for governed internal MCP use. While examples of private enterprise MCP registries are nascent, they are undoubtedly being aided by emerging features in infrastructure platforms.

Take the MCP Center, powered by the Azure API Center, which demonstrates how to build MCP registries in Azure. Lunar.dev’s Custom MCP Server Registry also allows admins to create their own scoped internal MCP registries.

The benefits of an MCP registry

“The biggest benefit of an MCP registry is discoverability,” says Justin O’Connor, founder at Infracodebase, an agentic platform for cloud infrastructure, which hosts a public MCP registry for connecting AI agents to cloud providers.

“MCP servers often end up scattered across teams and systems, so a registry gives you one clear place where people can find what exists,” says O’Connor. This allows AI agents to discover tools with less trial and error.

Others agree that improved discovery is a much-needed element for autonomous agents. “MCP is designed to ensure agents have enough context to generate the right response, and registries are a natural extension of that,” says Incorta’s Alareqi.

A well-constructed MCP registry brings uniformity that aids adoption, reuse, and governance, says O’Connor, because it can be treated as an official inventory of approved capabilities. As such, MCP registries serve a similar function to package registries for software, he adds. 

In this way, an MCP registry can act as a source to vet and update MCP servers before exposing them to agents. Although an MCP registry doesn’t replace core authentication requirements for each MCP server, it does aid provenance and supply chain security, says O’Connor.

Core elements of an enterprise-grade MCP registry

While they share similarities with traditional software integration catalogs, MCP registries have some unique elements. “If you are treating the AI or MCP registry as just another static catalog, you’re doing it wrong,” says Christian Posta, VP and global field CTO at Solo.io, a cloud-native infrastructure company.

Many elements make up a high-quality MCP registry beyond a static tool catalog. In general, they can be boiled down to rich tool metadata, features for developers, and enhanced security guardrails. The effectiveness of an MCP registry will also depend on underlying MCP and API security best practices.

Rich tool metadata

First, an MCP registry needs the bread-and-butter details required to function with MCP. “A solid MCP registry needs to support the basics required by the protocol,” says O’Connor of Infracodebase. This includes how to connect to a server, the transport type, server URL, and configurations required, like environment variables or secrets.
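As a sketch of those basics, a registry entry might bundle the transport type, server URL, and required configuration like this. The field names and the server shown are illustrative, not taken from the MCP registry specification:

```json
{
  "name": "acme/jira-mcp",
  "description": "Read and create Jira issues",
  "transport": "streamable-http",
  "url": "https://mcp.acme.internal/jira",
  "config": {
    "env": ["JIRA_BASE_URL"],
    "secrets": ["JIRA_API_TOKEN"]
  }
}
```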

Next are details to aid tool discovery. An MCP registry must provide methods for AI agents to automatically discover the appropriate underlying MCP tools. 

According to Posta, making tools discoverable requires resources that enable semantic search, such as embeddings of the tool name, description, and input schema, along with clear summaries. Ideally, he adds, this experience layer supports progressive disclosure to optimize context windows.
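Posta’s point about semantic search can be sketched in a few lines. The snippet below is a toy illustration, not a real registry: it uses bag-of-words cosine similarity in place of an embedding model, and the tool names and descriptions are invented:

```python
from collections import Counter
from math import sqrt

def vectorize(text):
    """Crude bag-of-words vector; a real registry would use an embedding model."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Hypothetical registry entries: tool name plus the description indexed for search.
TOOLS = {
    "jira.create_issue": "create a new issue ticket in a jira project",
    "github.open_pr": "open a pull request against a github repository",
    "slack.post_message": "post a message to a slack channel",
}

def discover(query, k=2):
    """Rank registered tools by similarity to the agent's natural-language query."""
    qv = vectorize(query)
    ranked = sorted(TOOLS, key=lambda t: cosine(qv, vectorize(TOOLS[t])), reverse=True)
    return ranked[:k]
```

Returning only the top-k matches is one simple form of the progressive disclosure Posta mentions: the agent sees a short candidate list rather than the whole catalog in its context window.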

“Agents need context,” adds Incorta’s Alareqi. “Metadata around capabilities, schemas, side effects, cost, latency, and failure modes, to name a few, is what allows an agent to choose the right tool.”

William Collins, director of tech evangelism at Itential, provider of an infrastructure orchestration platform, also sees semantic cues as necessary for discovery, along with other metadata. Registries should expose rich semantic metadata beyond endpoint descriptions, versioning with breaking-change signaling, and clear capability scoping, he says.

Developer controls

Although agents will use MCP registries programmatically, the registries still must be maintained by human developers (most likely by platform engineers). The MCP registry should therefore provide controls to add new servers, remove them, and set privileges.

To streamline this, a key aspect of an effective MCP registry is good developer experience, says Ido Halevi, director of product management at Silverfort, an identity security company. “That means clear documentation, examples of usage from other teams, and reliability signals such as active maintenance and adoption across agents,” Halevi says.

A strong registry also provides context beyond being a basic tool list. “Teams need to know whether an MCP server is maintained, how widely it’s used, and what kinds of risks or privileges it requires,” says Jessica Kerr, engineering manager of developer relations at Honeycomb, an observability platform provider. For instance, Kerr suggests adding lightweight moderation controls to flag dependable versus experimental MCP servers.

Security guardrails

Since the concept of MCP registries is so new, security standards and guidelines are still emerging. “It’s a bit like the wild west,” says Gil Feig, co-founder and CTO of Merge, provider of a unified API platform. 

Because of this, Feig emphasizes the need for strong security guardrails and privilege boundaries. “When evaluating an MCP registry, look for one that offers robust authentication, observability, and data governance with built-in rules, proactive alerts, and real-time logs,” he says.

The authorization context will especially matter to ensure that agents are using MCP tools permitted by the organization and have authorized access to sensitive material. As such, MCP registries will require information on the agent identity, its intent, and what user it’s acting on behalf of, says Posta. 

“Registries should favor servers that properly separate user sessions so data does not leak between users,” adds O’Connor, who notes that support for per-user authentication using modern OAuth patterns helps ensure access that is matched to privileges.

Similarly, Halevi underscores the need for enforcement beyond pure tool discovery. “Without enforcement, all you’re doing is cataloging risk,” he says. A registry should help control which agents can access which tools, and dynamically enforce permissions when a tool is invoked.
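Halevi’s distinction between cataloging and enforcing can be made concrete with a small sketch. The agent names, tool names, and allowlist below are hypothetical; a real registry would back this check with identity and policy infrastructure:

```python
# Hypothetical allowlist: which agents may invoke which registered tools.
PERMISSIONS = {
    "support-agent": {"jira.create_issue", "jira.search"},
    "review-agent": {"github.open_pr"},
}

def invoke(agent, tool, call):
    """Enforce permissions at invocation time, not just at discovery time."""
    if tool not in PERMISSIONS.get(agent, set()):
        raise PermissionError(f"{agent} may not invoke {tool}")
    return call()
```

The point of the gate sitting inside `invoke` is that an agent that discovered a tool it shouldn’t use still cannot call it, which is the enforcement step a pure catalog lacks.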

Native API handling underneath

There’s only so much a registry can do on its own, however. Core authentication nuances will differ from MCP server to MCP server, and each will require the same security rigor as a standard API connection.

“At the server level, MCP servers must be built with robust security capabilities from the ground up,” says Alex Salazar, co-founder and CEO of Arcade.dev, the maker of an AI tool calling platform. An MCP registry doesn’t replace core MCP server security basics such as OAuth-based authentication, proper token and secrets handling, and observability.

“The issue here is many AI applications don’t have any native API handling in place,” adds Melissa Ruzzi, director of AI at AppOmni, a cybersecurity company. “So they look to the MCP registry as a way to control MCP authentication, which is not a good practice.” 

Others aren’t certain guardrails belong at the registry level to begin with. “Security guardrails and privilege boundaries are really the responsibility of the underlying agents and not the best function of a registry-as-exchange,” says Dan Fink, AVP software architect at Cognizant, an enterprise technology consulting firm.

To really enforce this, adds Fink, you’d need additional layers that are either too heavy, like introducing entirely new agents as intermediaries, or too simple, like guardrail tags that could easily be faked or rendered obsolete.

For this reason and others, some view the MCP registry itself as more of an abstraction layer, which only defines high-level capabilities that are then mapped to underlying scopes, roles, and APIs. 

“Registries should express guardrails so orchestration layers can enforce them,” says Itential’s Collins. “This way, the registry doesn’t become a bottleneck and single point of failure.” For Collins, guardrails to enforce at the registry include privilege boundaries, authentication requirements, and risk classifications.

“An enterprise MCP registry should be slightly abstracted, not one-to-one with every tool privilege,” says Asperitas Consulting’s Ashmore. A thin abstraction layer, as opposed to one that directly mirrors every underlying permission, also enables you to standardize permission names across tools, reuse role templates, and separate user types, he adds.
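A thin abstraction layer of the kind Ashmore describes can be sketched as a simple mapping. Everything below (the capability names, scopes, and role templates) is hypothetical, meant only to show the shape of the idea:

```python
# Hypothetical capability map: the registry exposes coarse capability names,
# and an orchestration layer expands them into the concrete scopes to enforce.
CAPABILITY_MAP = {
    "tickets.read":  {"jira:read_issue", "jira:search"},
    "tickets.write": {"jira:create_issue", "jira:update_issue"},
    "code.review":   {"github:read_pr", "github:comment_pr"},
}

# Reusable role templates, expressed in capability names rather than raw scopes.
ROLE_TEMPLATES = {
    "support-agent": {"tickets.read", "tickets.write"},
    "review-agent":  {"code.review", "tickets.read"},
}

def scopes_for(role):
    """Expand a role's capabilities into the underlying tool scopes."""
    scopes = set()
    for cap in ROLE_TEMPLATES.get(role, set()):
        scopes |= CAPABILITY_MAP.get(cap, set())
    return scopes
```

Because roles are defined against capability names, renaming or re-scoping an underlying tool only changes `CAPABILITY_MAP`, which is the standardization benefit Ashmore describes.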

Life cycle and performance

As a tagalong to security guardrails, an MCP registry is an opportune location to introduce supply chain security features and monitoring.

“This includes vetting servers before they’re discoverable, implementing security scans and vulnerability checks, and controlling what can be published or discovered,” says Arcade’s Salazar. He says that registries should track performance metrics and errors, as well.

In addition to dynamic tool discovery and tooling governance, Marco Palladino, CTO and co-founder of Kong, provider of a cloud-native API platform, sees observability across the AI data path as necessary for an enterprise-grade MCP registry.

“Enterprises need centralized visibility into tool usage, health, and failures to support monitoring, optimization, cost management, and compliance,” says Palladino. “Without this, organizations face fragmented integrations and increased operational risk.” 

Beyond the above areas, experts foresee that other attributes will be necessary for MCP registries in an enterprise context: 

  • Fingerprinting of the tools within a particular server
  • A bridge between private and public registries
  • Ranking or scoring based on previous performance, token cost, and other attributes
  • Namespace verification to prevent naming conflicts
  • Validation layers to catch errors
  • Health monitoring to track server availability and performance 
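As one illustration of the ranking-and-scoring idea from the list above, a registry might fold reliability and token cost into a composite score. The weights, metrics, and server entries here are invented for the sketch:

```python
def score(entry, w_success=0.7, w_cost=0.3):
    """Composite ranking: higher success rate is better, higher token cost is worse.
    The weights and the 10,000-token normalization cap are illustrative choices."""
    cost_penalty = min(entry["avg_tokens"] / 10_000, 1.0)  # normalize cost to [0, 1]
    return w_success * entry["success_rate"] - w_cost * cost_penalty

# Hypothetical servers with previously observed performance attributes.
servers = [
    {"name": "crm-tools", "success_rate": 0.98, "avg_tokens": 1_200},
    {"name": "etl-tools", "success_rate": 0.91, "avg_tokens": 8_000},
]
ranked = sorted(servers, key=score, reverse=True)
```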

Choosing a public or private MCP registry 

When implementing an MCP registry, organizations have two options: either use a public MCP registry or create a private self-hosted MCP registry. According to the experts, there are trade-offs between each approach.

“A public MCP registry has to be very well evaluated for possible security risks before use,” says AppOmni’s Ruzzi. Private registries are generally safer, she says, but the degree of risk depends on how they are implemented.

“The public registry ecosystem is still immature,” says Kevin Cochrane, CMO at Vultr, a cloud hosting provider. “We likely need a ‘Hugging Face for MCP’ — a trusted authority that can validate listings and set consistent standards.” Without that sort of layer, teams should be cautious about smaller third-party registries, he adds. 

Instead, a private MCP registry can help an enterprise govern its portfolio. “Put a private MCP registry at the heart of the AI runtime,” Cochrane says. “This should be core infrastructure owned by platform engineering, with governance over how MCP servers are built, tested, deployed, and monitored.”

Infracodebase’s O’Connor adds that such curated registries engender trust in specific tools. “Over time, registries also become a trust boundary, especially in public settings, because they shape what tools people are willing to bring into workflows,” he says.

For many, the starting point will likely be a combination of both. This could equate to forking a sample open-source MCP registry and extending it to your needs. 

“Another way is to take a published OpenAPI specification and generate a skeleton service implementation in a language of your choice,” says Andrei Denissov, associate director of software engineering at Cognizant AI Lab, the AI research arm for Cognizant.

Tips on building MCP registries

Experimentation with MCP registries is in its early days. However, developers on the front lines are already pulling out lessons learned and discovering patterns for both good and bad designs. 

One lesson is simply that you need registries sooner than you think. “Working with teams deploying MCP at an enterprise scale, the pattern is consistent: Registries become necessary faster than organizations expect,” says Silverfort’s Halevi.

Then, those implementing MCP registries quickly learn that a basic MCP catalog is only one part of the picture. “Enterprises need much more than just MCP tool discovery. They need per-agent authorization models, guaranteed human-linked attribution, deep observability into agent behavior, and inline enforcement,” says Halevi.

When operating many MCP servers at scale, other requirements beyond discovery begin to become just as important, adds Halevi, such as MCP server orchestration, managing keys, keeping versions aligned, and managing configuration changes.

Balancing agentic autonomy and control

In the enterprise, sanctioned MCP use is proving to be incredibly powerful. Just take the case of Workato, which experienced a 700% increase in Claude chats from internal employees over a 60-day period when it turned on enterprise MCP features. Support engineers, financial analysts, sales leads, and others are building new workflows that grow Workato’s business in tangible ways, in large part thanks to MCP.

Getting those results, however, requires balancing agentic autonomy with control. That’s where an MCP registry can shine. For an enterprise, the quality of an MCP registry doesn’t just depend on listing every MCP server in a directory. It hinges on trust, safety, and smart controls — especially to prevent leaking data from chat streams across inter-organizational agent workflows, for instance.

As such, enterprises going “all in” on MCP should seriously consider MCP registries as a core infrastructure, with all the standard architectural enterprise bells and whistles. “It should be treated like any other serious piece of software,” says Alareqi. “That means strong versioning, life-cycle management, and observability.”


Kotlin 2.3.20 harmonizes with C, JavaScript/TypeScript 27 Mar 2026, 3:18 pm

Kotlin 2.3.20 is the latest version of the JetBrains-built language, featuring an experimental interoperability mode for C and Objective-C libraries and name-based destructuring declarations that match variables to property names. Developers also can now implement Kotlin interfaces from JavaScript and TypeScript.

The update to the Java rival language was introduced March 16; instructions for getting started can be found on the Kotlin website. With Kotlin/Native, the technology for compiling Kotlin code to native binaries, developers can now try an experimental interoperability mode for Objective-C and C libraries. The capability is geared to developers who use C or Objective-C libraries in Kotlin Multiplatform (KMP) libraries or applications. In general, Kotlin/Native enables importing C and Objective-C libraries into Kotlin. For KMP libraries, however, this functionality is currently affected by KMP compatibility issues with older compiler versions: If a published KMP library compiled with one Kotlin version imports C or Objective-C libraries, it might be impossible to use that library in projects built with an earlier Kotlin version. To address this and other issues, the Kotlin team has been revising the interoperability mechanism, and starting with Kotlin 2.3.20, developers can try the new mode through a compiler option.

Also Kotlin 2.3.20 introduces name-based destructuring declarations that match variables to property names instead of relying on position-based componentN() functions. Previously, destructuring declarations used position-based destructuring, JetBrains said.

The update lifts the limitation on implementing Kotlin interfaces on the JavaScript and TypeScript sides, JetBrains said. Previously, it only was possible to export Kotlin interfaces to TypeScript as TypeScript interfaces; implementing them from TypeScript was forbidden. Additionally, starting with Kotlin 2.3.20, Kotlin/JS supports the SWC Rust-based compilation platform. This helps with transpiling newer versions of JavaScript and TypeScript code into older and more compatible JavaScript code.

Kotlin 2.3.20 follows the December 2025 release of Kotlin 2.3.0 and the February release of Kotlin 2.3.10. Elsewhere in Kotlin 2.3.20:

  • For Java interoperability, the compiler now recognizes the Vert.x @Nullable annotation for nullability checks. This release also adds support for the Java @Unmodifiable and @UnmodifiableView annotations to treat annotated collections as read-only in Kotlin.
  • It is easier to set up Kotlin in Maven build tool projects. Now, Kotlin supports the automatic configuration of source roots and Kotlin’s standard library.
  • Kotlin 2.3.20 is fully compatible with Gradle build tool Versions 7.6.3 through 9.3.0. Developers also can use Gradle versions up to the latest Gradle release. Developers should be aware that doing so may result in deprecation warnings, and some new Gradle features might not work.
  • The Lombok compiler plug-in for generation and use of Java Lombok declarations has been promoted to alpha status. Plans call for making this functionality production-ready, but it is still under development.
  • The Map.Entry.copy() extension function is introduced for creating an immutable copy of a Map.Entry. This function allows for reusing entries obtained from Map.entries after modifying the map by copying them first.


Final training of AI models is a fraction of their total cost 27 Mar 2026, 10:08 am

AI models cost a lot more to develop than you may think. AI research company Epoch AI has set out all the costs of building a new AI model, explaining why AI companies are so concerned about perceived threats to their intellectual property.

It has looked into this before: Last year, it estimated that of OpenAI’s $5 billion expenditure on R&D, only about 10 percent went on the final training runs, with the majority going on scaling, synthetic data generation, and basic research.

At the time, Epoch was unsure whether this was a peculiarity of OpenAI. But now two Chinese companies, MiniMax and Z.ai, have also disclosed their R&D compute spending, and Epoch has found that, despite the differences in company size, final training runs are only a small part of their R&D expenditure too.

Epoch set out more detail about the issue. It said that if “most of the spending is exploration rather than execution, then a competitor who learns what works from the frontier could replicate the results for a fraction of the original cost.”

This has been a concern of US AI companies for some time.  Google has already expressed concerns about intellectual property theft. And Anthropic has fingered MiniMax as a company that has sought to extract Claude’s capabilities to enhance its own offerings. It’s clear that any business looking to develop AI models is going to be committing to spend huge sums of money: The training is just a small part of it.


OpenAI adds plugin system to Codex to help enterprises govern AI coding agents 27 Mar 2026, 5:27 am

OpenAI has introduced a plugin system for Codex, its AI-powered software engineering platform, giving enterprise IT teams a way to package coding workflows, application integrations, and external tool configurations into versioned, installable bundles that can be distributed or blocked across development organizations.

“We’re rolling out plugins in Codex,” OpenAI Developers, the company’s official developer account, posted on X.  “Codex now works seamlessly out of the box with the most important tools builders already use, like Slack, Figma, Notion, Gmail, and more.”

Plugins are “installable bundles for reusable Codex workflows” that “make it easier to share the same setup across projects or teams,” an OpenAI developer portal documentation noted. Each bundle can contain skills, which the documentation describes as prompts that the Codex agent can discover and execute, along with optional application integrations and Model Context Protocol server configurations that give the agent access to remote tools or shared context, it added.

A governance layer for agentic AI

How those bundles are distributed and governed is controlled through a separate policy layer, the documentation said.

Organizations can define plugin catalogs, called marketplaces, in JSON files scoped either to a repository or to an individual developer’s environment. Each plugin entry carries an installation policy with values including “INSTALLED_BY_DEFAULT,” “AVAILABLE,” and “NOT_AVAILABLE,” giving administrators the ability to push, restrict, or block plugins across the developer workforce, the document added. Authentication behavior is configurable at the policy level as well.
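The policy values above imply a straightforward catalog shape. The following is a minimal sketch of how an admin tool might partition such a catalog; the JSON field names (`plugins`, `installationPolicy`) are assumptions for illustration, since the article does not reproduce the actual schema:

```python
import json

# Hypothetical marketplace catalog, modeled on the policy values the
# documentation describes; the exact JSON schema is an assumption.
MARKETPLACE_JSON = """
{
  "plugins": [
    {"name": "slack-workflows", "installationPolicy": "INSTALLED_BY_DEFAULT"},
    {"name": "figma-handoff",   "installationPolicy": "AVAILABLE"},
    {"name": "legacy-deploy",   "installationPolicy": "NOT_AVAILABLE"}
  ]
}
"""

def partition_by_policy(catalog: dict) -> dict:
    """Group plugin names by their installation policy."""
    groups = {"INSTALLED_BY_DEFAULT": [], "AVAILABLE": [], "NOT_AVAILABLE": []}
    for plugin in catalog.get("plugins", []):
        policy = plugin.get("installationPolicy", "NOT_AVAILABLE")
        groups.setdefault(policy, []).append(plugin["name"])
    return groups

groups = partition_by_policy(json.loads(MARKETPLACE_JSON))
print(groups["INSTALLED_BY_DEFAULT"])  # plugins pushed to every developer
print(groups["NOT_AVAILABLE"])         # plugins blocked by administrators
```

In this model, pushing, restricting, or blocking a plugin across the workforce is just a matter of which bucket its catalog entry falls into.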

The plugin feature is the latest in a run of enterprise-focused additions to Codex since OpenAI announced the platform’s general availability in October 2025, when it said Cisco had reported pull request review times falling by as much as 50% after deployment. Admin tooling released at the same time gave ChatGPT Business, Edu, and Enterprise customers environment controls, usage analytics dashboards, and managed configuration options for the Codex CLI and IDE extension.

“Centralized control over which plugins are permitted, blocked, or deployed by default directly addresses concerns around security, compliance, and operational consistency,” said Charlie Dai, VP and principal analyst at Forrester. “It aligns AI agents with existing IT governance models rather than bypassing them.”

Adoption will be gradual, Dai said. “While technical tooling is advancing quickly, most enterprises will adopt this incrementally, led by platform engineering and developer productivity teams,” he said.

Agent behavior as managed infrastructure

Beyond the pace of adoption, Dai said the plugin system signals a broader shift in how enterprises are expected to manage AI-assisted development.

“By encapsulating standards, workflows, and tool access into versioned artifacts, organizations elevate AI-assisted development from ad hoc usage to managed infrastructure,” he said.

That distinguishes Codex from its main rivals. GitHub Copilot Extensions, which reached general availability in early 2025, lets developers invoke third-party tools from Copilot Chat inside Visual Studio Code, JetBrains IDEs, and GitHub.com, with a public marketplace hosting extensions from vendors including Docker, Sentry, and Perplexity. The emphasis is on contextual tool access during chat sessions rather than governing agent behavior at scale.

Cursor, another rival, launched its own plugin marketplace in February. The company expanded it this month, adding more than 30 integrations from partners including Atlassian, Datadog, and GitLab, according to Cursor’s changelog. Teams and Enterprise administrators can also create private marketplaces for controlled distribution.

Anthropic has moved in a similar direction, introducing workflow automation plugins for its Claude Cowork platform earlier this year.

“Compared with GitHub Copilot or Cursor, OpenAI is extending beyond policy enforcement into behavioral standardization,” Dai said. “Competitors focus primarily on permissions and guardrails; Codex begins to formalize execution patterns at scale.”

The missing third-party ecosystem

That behavioral standardization, however, has a notable constraint for now.

OpenAI has not opened self-serve publishing to its official plugin directory. “Adding plugins to the official Plugin Directory is coming soon,” the documentation said. “Self-serve plugin publishing and management are coming soon.” Organizations are limited for now to private marketplaces scoped to a repository or to an individual developer’s environment.

On the other hand, GitHub’s marketplace has been open to third-party builders since early 2025. Cursor’s marketplace already lists more than 30 external partners. OpenAI’s directory so far contains only plugins curated by the company itself.

“Long-term platform stickiness will depend on a curated third-party ecosystem that expands capability breadth and accelerates innovation,” Dai said. “Mature enterprises will expect audited, interoperable plugins for domain-specific tooling and regulated workflows. Without this external ecosystem, Codex risks limited extensibility beyond core engineering use cases.”


Anthropic throttles Claude subscriptions to meet capacity 27 Mar 2026, 4:48 am

Anthropic has started limiting usage across its Claude subscriptions to cope with rising demand that is stretching its compute capacity.

“To manage growing demand for Claude we’re adjusting our 5 hour session limits for free/Pro/Max subs during peak hours. Your weekly limits remain unchanged,” Thariq Shihipar, a member of Anthropic’s technical staff, wrote in a post on X.

“During weekdays between 5am–11am PT / 1pm–7pm GMT, you’ll move through your 5-hour session limits faster than before,” Shihipar added.

In effect, the change concentrates tighter usage controls during peak global working hours, when demand from both enterprise and individual users is highest.

The rationale here is that by accelerating how quickly users hit their session limits within these windows, Anthropic is effectively redistributing access to prevent system overloads while still preserving overall weekly usage quotas.

Pro users affected

Anthropic’s Shihipar, in his post, was referring to Claude’s subscription plans, which include usage limits, unlike Claude’s API plans, which are unaffected by the change.

The subscription plans’ usage limits, according to the model provider, control how much a user can interact with Claude over a specific time period.

“Think of this as your ‘conversation budget’ that determines how many messages you can send to Claude, or how long you can work with Claude Code, before needing to wait for your limit to reset,” the documentation reads.

It is worth noting that Anthropic doesn’t define pricing for its subscription plans as clearly as for its API-based plans, whose cost depends on multiple factors, including base input tokens, cache writes, and output tokens.

The change in usage limits, Shihipar later wrote, will affect approximately “7% of users,” particularly on Pro tiers. As a workaround, he recommended running “token-intensive background jobs” during “off-peak hours” to stretch session limits.
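That off-peak advice is easy to automate. Below is a minimal scheduling sketch based only on the window quoted above (weekdays, 1pm–7pm GMT, treating GMT as UTC); the helper names are illustrative and not part of any Anthropic API:

```python
from datetime import datetime, timezone

# Peak window from Anthropic's announcement:
# weekdays, 5am-11am PT / 1pm-7pm GMT (GMT treated as UTC here).
PEAK_START_UTC = 13  # 1pm GMT
PEAK_END_UTC = 19    # 7pm GMT

def is_peak(now: datetime) -> bool:
    """True if `now` (UTC) falls inside the throttled weekday window."""
    is_weekday = now.weekday() < 5  # Mon=0 .. Fri=4
    return is_weekday and PEAK_START_UTC <= now.hour < PEAK_END_UTC

# Example: decide whether to defer a token-intensive background job.
now = datetime.now(timezone.utc)
if is_peak(now):
    print("Peak window: defer heavy jobs to stretch the 5-hour session limit")
else:
    print("Off-peak: safe to run token-intensive background jobs")
```

A cron job or CI gate could call `is_peak` before kicking off large batch runs, shifting them to the hours where session limits deplete at the normal rate.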

Push to adopt API-based plans?

Analysts say the move will create friction for some users.

“The impact is largely limited to individual users, prosumers, and small teams using Claude via subscription plans, where usage caps and throttling are expected to manage shared compute and costs,” said Pareekh Jain, principal analyst at Pareekh Consulting.

Enterprises, Jain added, are largely insulated, as they typically rely on API-based consumption or dedicated contracts.

That said, Jain noted that power users inside enterprises would also be affected, slowing experimentation and pilots, which he said could be an Anthropic strategy to push teams toward API adoption.

That push, according to Forrester VP and principal analyst Charlie Dai, could mean more guaranteed revenue for the model provider.

Offering a contrarian view on how the change affects large enterprises, Greyhound Research chief analyst Sanchit Vir Gogia pointed out that most enterprises are not operating in a clean, API-only model and, in reality, usage is fragmented across subscription tiers, team environments, developer tooling, and API integrations.

“It is within this blended environment that the impact begins to surface. Subscription layers are no longer peripheral. They power real workflows, particularly in development, analytics, and rapid execution scenarios. When those layers become inconsistent during peak demand, enterprise productivity is affected indirectly but meaningfully,” Gogia said.

In fact, Gogia, too, sees the change forcing enterprises to choose API-based plans to ensure productivity: “Enterprises are entering a phase where performance consistency is no longer assumed. It must be architected, negotiated, and paid for. If demand continues to outpace infrastructure capacity, this segmentation will become more explicit.”

All the more so because analysts expect Anthropic’s rivals and other vendors to take similar routes in balancing usage with capacity.

“Since all major vendors are either introducing or will introduce similar constraints, impacted users may not get relief by moving to another vendor platform. They would be effectively moving across different forms of limits rather than escaping them entirely,” Jain said.

Limited backlash due to limited options

That same logic is also why Jain sees limited backlash from users, especially when it comes to switching vendors, as a response to the policy change from Anthropic.

Echoing Jain, Avasant research director Chandrika Dutt pointed out that the capacity-throttling measures being implemented by model providers closely mirror strategies employed by hyperscalers during the early days of cloud computing.

Cloud providers such as Amazon Web Services, Microsoft Azure, and Google Cloud, Dutt said, faced similar capacity constraints and, instead of scaling instantly, introduced mechanisms to shape demand and smooth consumption patterns, such as reserved capacity models and pricing incentives to shift usage to off-peak periods.


Edge clouds and local data centers reshape IT 27 Mar 2026, 2:00 am

For more than 10 years, enterprise cloud strategy has relied on centralizing as much as possible—shifting workloads from data centers, consolidating operations on hyperscale platforms, and leveraging economies of scale. This approach has reduced infrastructure sprawl, accelerated deployment, and provided nearly unlimited compute and storage. However, the next generation of digital systems increasingly interacts with regional regulations, real-time decision loops, and the physical world in general. These factors do not tolerate distance well. Smart traffic systems can’t wait for a round-trip to distant cloud regions. Industrial control systems can’t halt operations because a wide-area link is congested. AI-driven video analytics becomes costly and inefficient when every frame must be sent back to a centralized platform for inference. In these environments, it matters where the data is created and processed and where decisions are made.

The future of cloud computing is neither more nor less centralized. It is selectively distributed, with edge cloud and localized data centers becoming essential in situations where latency, sovereignty, and physical-world responsiveness matter most.

That is the real story behind the rise of edge cloud. It’s not hype, a complete reversal of cloud adoption, or a nostalgic return to on-premises infrastructure. Instead, what’s emerging is a more practical dual architecture: a centralized cloud for aggregation, model training, cross-region coordination, and platform services, paired with local infrastructure for time-sensitive processing, regional independence, and compliance-driven workloads.

Use cases for edge clouds

Edge cloud involves deploying compute, storage, and networking resources closer to users, devices, and data sources. It looks like telecom facilities at the metro edge, or micro data centers in hospitals, retail outlets, factories, or municipal centers. These localized data centers support workloads that benefit from proximity, embodying the regional computing principle of placing workloads where they are most operationally and economically effective.

The trend is accelerating because multiple forces are converging at once. Low-latency applications are moving from pilot projects to full production. AI is transitioning from centralized training to distributed inference. Data residency laws are becoming more specific and easier to enforce. Enterprises are also realizing that bandwidth is limited, and transmitting massive amounts of sensor, video, and telemetry data to a central cloud can often be a poor design choice hidden behind architectural simplicity.

Consider smart cities. Municipal systems are no longer limited to back-office software and basic public websites. City systems now include connected traffic lights, intelligent surveillance, environmental sensors, safety systems, transit monitoring, and energy efficiency platforms, all generating continuous streams of local data that require immediate responses. Detecting congestion, hazards, or emergency vehicle routes at intersections demands quick action. Relying on distant cloud analysis can delay responses, risking public safety.

The same logic applies in industrial settings. Connected factories increasingly use machine vision, predictive maintenance models, robotics, telemetry, and digital twins to boost throughput and minimize downtime. Much of that data has local value first and global value second. A detection model for defects running alongside a production line can stop defective output in real time. A centralized system can still gather data for fleet-wide analytics, training, and optimization, but it should not be on the critical path of every local decision. This is where edge cloud delivers tangible business value as a way to keep local operations fast, resilient, and cost-effective.

Healthcare can’t rely solely on a centralized cloud system. Regional setups depend on imaging, monitoring, connected devices, and patient-facing services. Some workloads must remain local because of privacy concerns, network limitations, or response time requirements. Hospitals need local computing for imaging, decision support, and operations that can’t risk WAN failures. At the same time, they require centralized platforms for analytics, model development, and data integration. Hybrid is the best operating model.

Retail demonstrates another vital aspect of edge: local processing for personalization, inventory, checkout, and analytics. Pushing all transactions to a central platform is costly, especially when business value is immediate and local. Stores that adapt staffing, promotions, or fulfillment in real time gain an edge. This doesn’t mean abandoning centralized platforms but rather extending them with localized execution.

Telecom providers, colocation operators, and cloud vendors recognize this opportunity. Telecom companies aim to monetize network proximity by converting metro infrastructure into application platforms. Colocation providers position regional facilities as neutral points for latency-sensitive workloads, data exchange, and multi-cloud interconnection. Hyperscale cloud vendors respond by expanding managed services through local zones, distributed appliances, and edge-specific platforms. Everyone strives to control the plane in a world where compute becomes increasingly decentralized.

When hype outruns architecture

Deploying edge infrastructure is easy to celebrate in strategy decks because it sounds modern and inevitable. However, operating it at scale is much less glamorous. Managing a centralized cloud region is already challenging, but having hundreds of distributed sites with hardware limitations, physical exposure, inconsistent connectivity, and varying operational maturity presents a completely different set of problems. The issue isn’t just deploying small clusters across many locations. It involves life-cycle management, security hardening, observability, orchestration, failover, and governance within an inherently fragmented estate.

Security complexity rises as each distributed site increases the attack surface. Remote, diverse infrastructure makes patching harder. Identity, certificates, and policies must be consistent across locations with varying staffing and controls. Many underestimate the operational burden, thinking edge is just cloud with shorter networks.

Observability remains a significant gap. Distributed systems fail in distributed ways, which rapidly multiplies blind spots. If enterprises cannot monitor what is happening across thousands of nodes, local clusters, gateways, and data pipelines, they are not truly operating at the edge—they are building up technical debt in smaller units.

Interoperability also remains underdeveloped. Despite vendor claims, many edge solutions are still too tightly linked to specific hardware stacks, connectivity methods, or cloud ecosystems. This creates lock-in risks exactly when enterprises seek greater architectural flexibility.

Edge advocates stress lower latency and better bandwidth, both of which provide real benefits. However, local infrastructure costs include capital, staffing, remote management, and maintenance. The case is strong if the workload genuinely needs local processing but weak if it’s adopted just because it sounds strategic. Running workloads at the edge without real-time capabilities, sovereignty, or resilience is often just expensive infrastructure rather than true innovation.

That is why enterprise leaders should resist the temptation to frame edge as the next universal destination for workloads. It is not. Some apps fit in centralized cloud regions, some belong in data centers, and others in localized facilities. The aim isn’t architectural purity but placement discipline. A helpful way to think about edge adoption in the next three to five years is to start with three questions:

  1. What decisions need to be made locally because of latency, safety, or user experience?
  2. What data should remain local because regulation, privacy, or economics make centralization a poor option?
  3. What operations must keep going even when connectivity to a centralized cloud is limited?

If a workload clearly benefits from one or more of those criteria, edge deserves serious consideration. If not, it probably fits better in a more centralized setup.
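The three-question screen can be expressed as a trivial heuristic. This is an illustrative sketch of the placement discipline the author describes, not a prescribed tool; the function names and threshold are assumptions:

```python
def edge_fit_score(needs_local_decisions: bool,
                   data_must_stay_local: bool,
                   must_survive_wan_outage: bool) -> int:
    """Count how many of the three placement criteria a workload meets."""
    return sum([needs_local_decisions, data_must_stay_local,
                must_survive_wan_outage])

def recommend_placement(*criteria: bool) -> str:
    """One or more criteria met -> edge candidate; none -> centralize."""
    return "edge candidate" if edge_fit_score(*criteria) >= 1 else "centralized cloud"

# A machine-vision defect detector: latency-critical, data stays on-site,
# must keep running through WAN outages.
print(recommend_placement(True, True, True))     # edge candidate
# A quarterly reporting batch job: none of the criteria apply.
print(recommend_placement(False, False, False))  # centralized cloud
```

The point is not the code but the discipline: every workload gets scored against the same three questions before anyone argues about infrastructure.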

CIOs and architects should also avoid treating edge as a disconnected side project. The preferred model remains the integrated hybrid cloud. Centralized platforms are still ideal for data aggregation, long-term storage, model training, enterprisewide policies, and shared digital services. Edge is where execution occurs close to the source of interaction. More mature organizations will treat these as coordinated layers within one architecture, rather than opposing camps in an infrastructure debate.

The cloud market is evolving beyond the one-size-fits-all centralization model that characterized its early days. This is the cloud maturing. Smart cities, industrial systems, healthcare networks, telecom infrastructure, and low-latency digital services all point to the same truth: Proximity has become a crucial architectural factor that can no longer be overlooked.

Enterprises don’t need edge computing everywhere. They need a strategy for where it truly matters. The next stage of cloud architecture will reward organizations that recognize a simple truth: The most effective cloud is the one that intentionally distributes intelligence.


On the pleasures and dangers of open source Python 27 Mar 2026, 2:00 am

Announced at JavaOne, Project Detroit proposes to break down the walls between Java, Python, and JavaScript. Also in this report: Better ways to instrument your code with Python’s new built-in sampling profiler, another run at using AI locally to rework a Python project, and the question on everyone’s mind right now (surely): What does OpenAI really want with Astral?

Top picks for Python readers on InfoWorld

OpenAI buys Python tools builder Astral
Astral, the maker of uv, ty, and pyx, has a new home under the OpenAI umbrella. Is OpenAI demonstrating its commitment to maintaining tooling in the AI space, or is the purchase more of a power move?

I ran Qwen3.5 locally instead of Claude Code. Here’s what happened
Want to run an LLM on your own hardware for that at-home Claude Code or Copilot experience? You can, but it’ll be a bumpy ride. My takeaway? Maybe don’t let the AI run around unsupervised after dark.

Hands-on with the new sampling profiler in Python 3.15
Among Python 3.15’s best new features is a sampling profiler. See how it works in this guide to using the profiler to instrument your code and find bottlenecks with minimal performance impact.

Project Detroit, bridging Java, Python, JavaScript, moves forward
The once-dead, now-revived Detroit project aims to allow Java’s Foreign Function and Memory API to talk seamlessly to other language runtimes. The vision? More powerful mixing and matching of languages across domains.

More good reads and Python updates elsewhere

The slow collapse of MkDocs
The strange, ongoing saga of how a developer meltdown took out one of the most popular documentation tools for Python—with no clear successor in sight.

Comparing the typing spec conformance of Python type-checking tools
How well do tools like Pyright, Pyrefly, Mypy, Ty, and others conform to Python’s own type annotation specs? The answers range, surprisingly, from “very closely” to “just barely.”

The optimization ladder: All the ways to make Python faster
From replacing the runtime to integrating modules written in C or Rust, here’s an end-to-end rundown of ways to speed up Python for tasks that urgently need performance.

License laundering and the death of ‘clean room’
When someone rewrote a long-unmaintained Python library with an LLM, the original developer broke a decade-plus silence to object. What are the implications for open source?


Context Hub vulnerable to supply chain attacks, says tester 26 Mar 2026, 8:25 pm

On the surface, the recent critique of a new tool called Context Hub, by a developer who created an open-source alternative, appears to show that the tool is vulnerable to misuse. But delve further and it serves as a far broader warning to AI developers about the downside of relying on non-authoritative sources of information.

Two weeks ago, Andrew Ng, founder of a Silicon Valley technical training firm called DeepLearning.AI, launched the product, which he stated in a LinkedIn post is an open tool that gives a coding agent the up-to-date API documentation it needs.

“Install it and prompt your agent to use it to fetch curated docs via a simple CLI,” the post reads. “Why this matters: Coding agents often use outdated APIs and hallucinate parameters. For example, when I ask Claude Code to call OpenAI’s GPT-5.2, it uses the older chat completions API instead of the newer responses API, even though the newer one has been out for a year. Context Hub solves this. Context Hub is also designed to get smarter over time.”

According to Ng, using Context Hub, agents can even annotate docs with notes. “If your agent discovers a workaround, it can save it and doesn’t have to rediscover it next session,” he said. “Longer term, we’re building toward agents sharing what they learn with each other, so the whole community benefits.”

Poisoning the project

However, on Wednesday, Mickey Shmueli, the developer of LAP, which he described as an “open source alternative to Context Hub,” released a Context Hub supply chain attack proof of concept (PoC) on GitHub.

He explained the problem he’d discovered: Context Hub contributors submit docs as GitHub pull requests, maintainers merge them, and agents fetch the content on demand, “[but] the pipeline has zero content sanitization at every stage.”

He wrote that the project “[has] published more than 1,000 API documents, and added a feature letting agents annotate docs for other agents. We tested whether a poisoned document in that registry could silently compromise developer projects.”

In the test, he wrote, “we created realistic poisoned docs containing fake dependencies and served them through an … MCP server inside isolated Docker containers.” He emphasized, however, that no poisoned content was uploaded to Context Hub’s registry; the tests were run locally on an MCP server configured to serve pre-built output from disk. But from the agent’s perspective, the experience was identical to fetching docs from the live registry.

The result: “When AI coding assistants fetched the docs, [Claude] Haiku silently wrote the fake package into requirements.txt in 100% of runs without ever mentioning it in its text output. A developer reading the assistant’s response would see nothing suspicious, but their project is poisoned.”

Only Claude Haiku, Sonnet, and Opus were tested; Opus fared best, Haiku worst. Results for other models such as GPT-4, Gemini, and Llama may differ, Shmueli noted.

Agentic AI likened to ‘high-speed idiot’

Responding to Shmueli’s findings, David Shipley, CEO of Beauceron Security, said Thursday, “[it is] time to have a moment of pure honesty about agentic AI. At its best, it’s a gullible, high-speed idiot that occasionally trips on hallucinogenic mushrooms that you’re giving the ability to act on your behalf. Stop and think about that. Would you knowingly hire a human that fit that description and then give them unsupervised access to code or your personal banking? I wouldn’t.”

LLM-based generative AI tools, he said, “do not have the capacity for critical thought or reasoning, period. They’re probability math and tokens. They’re faking reasoning by retuning and iterating prompts to reduce the chances of being wrong.” 

That is not critical thinking, Shipley said, noting, “what was true in the 1950s remains true today: Garbage in, garbage out.”

People, he said, “built stochastic parrots that can be manipulated by sweet talking to them, and they call it prompt engineering. Dudes, it’s social engineering. And the more the AI industry keeps telling us about the Emperor’s New Clothes, the dumber we all look for believing them.”

Supply chain attacks a ‘serious and scalable threat’

Justin St-Maurice, technical counselor at Info-Tech Research Group, echoed Shipley’s concerns. He noted, “supply chain attacks are a serious and scalable threat, and what we’re seeing this week is a good example of why. The vulnerability isn’t necessarily in the application itself. It’s in the dependency chain, the shared libraries, the package repositories, all the common infrastructure these systems are built on top of.”

He added, “we’ve seen this pattern before, many times. A single flaw gets introduced upstream, and suddenly a huge range of downstream systems are exposed, often before anyone has caught it. What’s different now is the speed at which AI-assisted development is moving. Developers are pulling in shared dependencies, using AI-generated code, and moving fast. If something gets introduced into one of those common sources, it can propagate across a wide range of systems very quickly.”

And in an AI context, said St-Maurice, “the impact isn’t just passive. These systems can consume those inputs and act on them, which makes the potential impact a lot bigger.”

He noted, “the LiteLLM situation and what’s happening with Context Hub are two examples in the same week. It’s definitely worth paying attention to. Vibe coders and people building quickly on top of AI tools need to think seriously about how they’re validating dependencies and managing upstream risk. Relying on prompts alone won’t be enough to manage security risks.”
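One concrete form of that validation is to diff AI-edited dependency files against a human-reviewed allowlist before anything is installed. Below is a minimal sketch assuming a hypothetical allowlist and plain requirements.txt syntax; a real pipeline would also check lockfiles and query the package registry:

```python
import re

# Hypothetical allowlist of dependencies a human has already reviewed.
APPROVED_PACKAGES = {"requests", "numpy", "openai"}

def unapproved_dependencies(requirements_text: str) -> list:
    """Return requirement names not on the reviewed allowlist.

    Catches the failure mode from the PoC: an assistant silently
    appending a fake package to requirements.txt.
    """
    flagged = []
    for line in requirements_text.splitlines():
        line = line.split("#")[0].strip()  # drop comments and blank lines
        if not line:
            continue
        # Strip version specifiers, extras, and markers to get the bare name.
        name = re.split(r"[=<>!~\[;]", line)[0].strip().lower()
        if name and name not in APPROVED_PACKAGES:
            flagged.append(name)
    return flagged

reqs = "requests==2.32.0\nnumpy>=1.26\nfake-sdk-helper==0.1.0  # added by agent\n"
print(unapproved_dependencies(reqs))  # ['fake-sdk-helper']
```

Run as a pre-commit hook or CI gate, a check like this surfaces exactly the silent write that Claude Haiku made in 100% of the PoC runs, regardless of what the assistant says in its text output.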
