Microsoft acquires Osmos to ease data engineering bottlenecks in Fabric 7 Jan 2026, 3:27 am
Microsoft has acquired AI-based data engineering firm Osmos for an undisclosed sum, as part of its effort to reduce data engineering friction inside Fabric, its unified data and analytics offering, as enterprises continue to push analytics and AI projects into production.
Osmos’ technology, which applies agentic AI to turn raw data into analytics- and AI-ready assets in OneLake, will help customers bypass a challenge common to most enterprises: spending more time on data preparation than on analysis, Bogdan Crivat, corporate VP of Azure Data Analytics, wrote in a blog post.
In fact, Roy Hasson, senior director of product at Microsoft, pointed out in a separate social media post that the Seattle-headquartered startup had launched its AI data wrangler and AI data engineering agents on Microsoft Fabric as a native app almost two years ago, and that they had become quite popular.
“We quickly realized that customers loved using Osmos on top of Fabric Spark, and it reduced their dev and maintenance efforts by >50%,” Hasson wrote.
The startup, before the acquisition, offered Osmos Data Agents for Microsoft Fabric, Osmos Data Agents for Databricks, and Osmos AI-Assist Suite (Uploaders, Pipelines, Datasets), which the company describes as a collection of AI-powered data ingestion and engineering tools that automate the process of bringing external, messy data into operational systems with minimal manual effort or coding.
What the Osmos deal means for enterprises
While Microsoft has yet to divulge details about the product roadmap for integrating Osmos’ technology into Fabric, analysts say the integration is likely to help both CIOs and development teams.
For CIOs, the benefit would revolve around operational efficiency and faster time-to-value for analytics and AI initiatives, especially in environments constrained by data engineering talent and budget, said Robert Kramer, principal analyst at Moor Insights and Strategy.
Another upside of the acquisition for CIOs, according to Stephanie Walter, practice leader of the AI stack at HyperFRAME Research, is enabling data engineering automation that is governed, reversible, and auditable.
“As AI moves from experimentation to enterprise scale, this level of controlled automation becomes essential for maintaining reliability, compliance, and trust,” Walter said.
However, Kramer, in contrast to Walter, cautioned that relying on Osmos’ technology for data engineering inside Fabric may deepen platform dependence, raising governance and risk questions about certifying agentic pipelines, auditing and rolling back changes, and aligning autonomous data engineering with regulatory and compliance expectations.
Reducing repetitive engineering work for developers
For developers, though, Kramer pointed out, the acquisition has the potential to improve productivity by reducing repetitive and low-value engineering work around messy data.
“Tasks such as data wrangling, mapping inconsistent external feeds, pipeline scaffolding, and boilerplate Spark-style transformation code could be generated by agents rather than hand-built, allowing engineers to focus on architecture, performance, data quality, and guardrail design,” Kramer said.
“The development lifecycle could tilt toward reviewing, testing, and hardening AI-generated pipelines and transformations, with observability, approval workflows, and reversibility becoming core design requirements,” Kramer added.
Complementing recent Fabric enhancements
Analysts also view Osmos’ acquisition as complementing recent Fabric enhancements, including the introduction of Fabric IQ.
“As Fabric expands with IQ, new databases, and deeper OneLake interoperability, the limiting factor shifts from data access to data readiness. Osmos addresses that gap by automating ingestion, transformation, and schema evolution directly within the Fabric environment,” Walter said.
“In the context of Fabric IQ, Osmos helps ensure that the data feeding the semantic and reasoning layers remains continuously curated and stable as upstream sources change. Semantic systems only work when the underlying data is consistent and explainable, and Osmos is designed to reduce the operational friction that otherwise undermines those efforts,” Walter added.
But what about Osmos’ products and customers?
However, it is not all good tidings for existing Osmos customers: the company is winding down its three offerings — Osmos Data Agents for Microsoft Fabric, Osmos Data Agents for Databricks, and the Osmos AI-Assist Suite — as standalone products this January. For now, Osmos’ technology will live only inside Fabric, and customers of the Databricks offering and the AI-Assist Suite will have to look for alternatives or find a way to work with Microsoft’s offerings.
Generative UI: The AI agent is the front end 7 Jan 2026, 1:00 am
The advent of Model Context Protocol (MCP) APIs hints at a coming era of agent-driven architecture. In this architecture, the chat interface becomes the front end and creates UI controls on the fly. Welcome to “generative UI.”
The new portlet
Once upon a time, the portal concept promised to give every user a personalized view. Finally, the promise of a web where the user was also the controller could be realized. That didn’t quite work out the way we thought. But today, generative UI proposes to take UI personalization to a whole new level, by marrying bespoke, as-needed UI components with agentic MCP APIs.
The old way involved writing back ends that provide APIs to do things and writing user interfaces that allow humans to easily take action on those APIs. The new idea is that we’ll provide MCP definitions that allow agents to take actions on the back end, while the front end becomes a set of definitions (like Zod schemas) that expose these capabilities.
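As a concrete illustration of what such a back-end definition might look like, here is a minimal sketch using the MCP TypeScript SDK; the server name, tool, and handler below are illustrative assumptions rather than code from any product mentioned in this article.

import { McpServer } from '@modelcontextprotocol/sdk/server/mcp.js';
import { StdioServerTransport } from '@modelcontextprotocol/sdk/server/stdio.js';
import { z } from 'zod';

// A hypothetical back-end capability exposed to agents instead of to a hand-built UI.
const server = new McpServer({ name: 'order-desk', version: '1.0.0' });

server.tool(
  'create_order',
  'Create a purchase order for a product SKU', // the description an agent reads when choosing tools
  { sku: z.string(), quantity: z.number().int().positive() },
  async ({ sku, quantity }) => ({
    // A real handler would call the order service; this one just echoes the request.
    content: [{ type: 'text', text: `Created order: ${quantity} x ${sku}` }],
  })
);

await server.connect(new StdioServerTransport());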
One of the greatest things about having observed the industry over a long stretch of time is a healthy skepticism. You’ve seen so many things arise and promise the moon. Sometimes they crash and burn. Sometimes they become important. If they are useful, they are absorbed into the developer’s toolkit.
This skepticism isn’t even a conscious thing anymore; it’s an instinctual reaction. When someone tells me that AI is going to produce user interfaces on the fly as needed, I immediately begin raising objections. Like performance and accuracy.
Then again, the overall impact of AI on development has been significant, so let’s take a closer look.
Hands-on with generative UI
I’m thinking about generative UI as a kind of evolution of the managed agentic environment (like Firebase Studio). With an agentic IDE, you can rapidly prototype UIs by typing in a description of what you want. GenUI is a logical next bridge, where the user prompts the hosted chatbot (like ChatGPT) to produce UI components that the user can interact with on the fly.
In a sense, if an AI tool, even something like Gemini or ChatGPT with Code Live Preview active, becomes powerful enough, it will push the person using it to wear the user hat, rather than the developer hat. We’ll probably see that occur gradually, where we eventually spend more time designing rather than coding, diving into developer mode only when things break or become more ambitious.
To get hands-on with this idea, we can look at Vercel’s GenUI demo (or the Datastax mirror), which implements the streamUI function:
The streamUI function allows you to stream React Server Components along with your language model generations to integrate dynamic user interfaces into your application. Learn more about the streamUI hook from the Vercel AI SDK.
Vercel’s GenUI demo will give you a taste of what is meant by on-the-fly UI components streamed alongside chat interaction:
[Screenshot: Vercel GenUI demo]
This is just a demo and it does the job of getting across the idea. It also exhibits plenty of typical AI foolishness and limitations. For instance, when I ask to buy “some Solana” in a stock buying chat it replies “Invalid amount.” So then I ask to buy “10 Solana” and it gives me a simple control with a Purchase button.
Of course, this is all for play, and there is no plumbing backing up that purchase. Creating that plumbing would be non-trivial (wiring up a wallet or bank account and all the attendant auth work).
But my purpose is not really to fault-find the demo. Some of the issues can be cleaned up with concerted developer work. Others are down to current limitations of large language models. By that I mean, there is a strange collision between the initial feeling of vast potential you get when using an AI or agentic tool and the hangover period of frustration that follows, when you suddenly find yourself with a mountain of AI-initiated “work” that will require hours of human concentration to master and wrangle.
It’s like you had a bit too much coffee and the caffeine wore off. Now you’ve got to roll up your sleeves and wrestle all of the big ideas into functioning software.
Vercel’s is not the only generative UI demo we can look at. Here’s another from Thesys:
[Screenshot: Thesys generative UI demo]
Microsoft’s AG-UI offers similar capabilities.
Is generative UI a good idea?
But let’s imagine that the genUI APIs and LLMs progress beyond their current state, and developers aren’t left with the heavy lifting. The main question is: Is a generative UI something we as human beings would ever actually want to use?
To be fair, the Vercel genUI is an API for use in other apps. That is to say, it allows you to stream UI components into the flow of AI responses. So maybe integrating on-demand React components via the streamUI API could be just the thing in the right setting, as part of another, well-considered UI.
It seems like a good UI with good UX is still the lion’s share of what people will use. I mean, I might sometimes want to ask my AI to find good deals on a flight to Kathmandu and then have it pop up an interface for buying the ticket, but usually I will just go to Expedia or whatever.
Even if you could perfectly express intention as a UI, when you finally do get a perfectly useful UI, you probably won’t want to continue to modify it in deep ways. You will want to save it and reuse it.
Typing out intention in English (or Hindi or German) is great for certain things, especially researching, brainstorming, and working through ideas, but the visual UI has huge advantages of its own. After all, that’s why Windows supplanted DOS for many uses.
But I hasten to add I’m not dismissing the idea out of hand. Perhaps some hybrid of designed UI along with chatbot prompt that can modify it on the go is in the cards.
An essential insight here is that if the web becomes a cloud of agentic endpoints, a realm of MCP (or similar) capabilities that give action to AI, then it will be a kind of marketplace of possible actions we can take using a natural-language interface. And the on-demand, bespoke UI component will become an almost inevitable element of that landscape.
Instead of a vast collection of documents and data, the web would be a collection of actions that could be taken based on intention and meaning.
Of course, the semantic web was supposed to make a web of meaning, but with AI a semantic web could be more practical. GenUI would be a new kind of way to provide tool definitions for engaging with that web.
Context architects
There is something here, but I don’t see genUI replacing UX and UI engineers anytime soon. Augmenting them, perhaps. Providing them with a new tool, maybe.
Similar to vibe coding, the idea that we’ll spend our time “architecting a context” using AI, rather than building interfaces, likely contains some of the character of the coming world of front-end development, but not the whole story.
The work of a UI developer in this model would consist of providing interface definitions that mediate between the chatbot and MCP servers. These definitions might look something like the snippet below. Vercel’s API uses Zod. This is just a pseudo-example:
import { z } from 'zod';

// This Zod schema acts as the "Interior Interface" for the AI Agent
const cryptoPurchaseTool = {
  description: 'Show this UI ONLY when the user explicitly asks to buy',
  parameters: z.object({
    coin: z.enum(['SOL', 'BTC', 'ETH']),
    amount: z.number().min(0.1).describe('Amount to buy'),
  }),
  generate: async ({ coin, amount }: { coin: string; amount: number }) => {
    // The AI plays within this sandbox
    return <PurchaseCard coin={coin} amount={amount} />; // hypothetical component (see the wiring sketch below)
  },
};
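And, purely as a sketch of my own (not code from the Vercel demo), here is roughly how a tool object like cryptoPurchaseTool would be wired into a streamUI call; the model choice and the PurchaseCard component are placeholder assumptions.

import { streamUI } from 'ai/rsc';
import { openai } from '@ai-sdk/openai';

// Hypothetical component that the tool streams back into the chat.
function PurchaseCard({ coin, amount }: { coin: string; amount: number }) {
  return <button>Purchase {amount} {coin}</button>;
}

const result = await streamUI({
  model: openai('gpt-4o'), // placeholder model choice
  prompt: 'Buy 10 Solana',
  text: ({ content }) => <p>{content}</p>, // plain text replies fall through here
  tools: {
    buyCrypto: cryptoPurchaseTool, // the schema-plus-generate object above
  },
});
// result.value is the streamed UI: either text or the purchase control.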
In a sense, this schema becomes the “interior UI” available to the AI, and the AI itself becomes a kind of universal human-machine intermediary. Is this really where we’re going? Only time will tell.
What the loom tells us about AI and coding 7 Jan 2026, 1:00 am
In the early 19th century, the invention of the loom threatened to turn the labor market upside down. Until then, cloth was made by skilled artisans, but the loom enabled more cloth to be made more quickly by less-skilled workers. One could even argue that the Jacquard loom, a loom that allowed for complex weaving patterns via punch cards, was the first computer.
This technology had a disruptive effect on the labor market and gave rise to the Luddites, a group who would physically destroy looms in factories. Jobs were lost, wages were depressed, and working conditions became more unpleasant. The loom led to social upheaval and drastic change in the short run.
But in the long run, the benefits were many. Making textile workers more productive meant more and better clothing for everyone. The advances in textile production were a harbinger of capital accumulation, economies of scale, and complementary innovations in many other areas as well, and the Industrial Revolution began.
This is a roundabout way of saying that I can’t stop writing about AI and coding.
Looming apocalypse
Just as the loom worried individual weavers, large language models (LLMs), coding agents, and the accompanying tools are the cause of considerable concern for software developers—and rightly so. We are starting to see shifts in hiring trends, with a decrease in the number of junior developers being hired. I’ve written before about our ongoing need for junior developers, if for no other reason than there will eventually be no senior developers without them, but the impact on the labor market cannot be ignored.
Shifts are definitely happening, and it remains to be seen what the effects will be. While we are a more sophisticated economy than the one faced by the Luddites, the ultimate effects of AI on software developers remain unclear. If AI writes all the code, what, exactly, will junior developers do? Juniors learn from doing the work of executing on design. Without that, how will they grow? How will they learn what they need to know to be senior developers? The short-term impact is something to worry about, however unclear it may be.
Nevertheless, I’m not afraid to make a few predictions about the long term.
If you were to tell a new parent in 1820 that their child was going to grow up to be a telegraph operator, a train conductor, or a professional photographer, they would have looked at you like you had two heads. Those jobs were unheard of at the time. Never mind if you told them their grandchildren would be pilots, radio operators, or movie producers.
Unknowable possibilities
The future is just as unknowable for our children. AI will open up new horizons and enable new technologies that we simply can’t predict. Most likely, my grandson will have a job title that doesn’t yet exist and that will astonish me in my dotage.
Even as recently as 25 years ago, I don’t think anyone could have conceived of many services we take for granted today. Services like Uber or Doordash—the melding of mobile technology, GPS, and advanced broadband—came about through a magical confluence of a hodgepodge of technologies. AI coding will likely be a part of other technologies that we don’t foresee. It seems likely that AI will enable us to build new things that will make AI even more powerful and capable.
There is no doubt that AI and LLMs will make the development of software more productive and take it to new, more sophisticated levels. What AI will enable isn’t knowable by us. But what we can be sure of is that this new technology will, as technology always does, combine with human ingenuity to create something amazing and mind-boggling.
There is no doubt about it. There will be jobs and technologies that will be commonplace in 2050 but aren’t yet even a twinkle in our eyes today.
AI won’t replace human devs for at least 5 years 6 Jan 2026, 10:14 pm
Human coders may have a temporary reprieve from losing their jobs to AI: It will be between five and six years before we reach full coding automation, according to a new report from LessWrong. This pushes back the online community’s previous predictions that the milestone would be reached much sooner, between January 2027 and September 2028.
The extended timeline comes just eight months after LessWrong’s initial findings, underscoring the precarious, subjective, ever-shifting nature of AI forecasting.
“The future is uncertain, but we shouldn’t just wait for it to arrive,” the researchers wrote in a report on their findings. “If we try to predict what will happen, if we pay attention to the trends and extrapolate them, if we build models of the underlying dynamics, then we’ll have a better sense of what is likely, and we’ll be less unprepared for what happens.”
Building a more nuanced model
According to LessWrong’s new AI Futures Model, AI will reach the level of “superhuman coder” by February 2032, and could ascend to artificial superintelligence (ASI) within five years of that. A superhuman coder is an AI system that can run 30x as many agents as an organization has human engineers with 5% of its compute budget. It works autonomously at the level of a top human coder, performing tasks in 30x less time than the organization’s best engineer, the researchers explained.
This new revelation pushes the timeline out 3.5 to 5 years farther than in LessWrong’s initial forecast in April 2025. This, it said, is the result of numerous reconsiderations, reframings, and shifting research strategies.
Notably, the researchers were “less bullish” on speedups in AI R&D, and relied on a new framework for a software intelligence explosion (SIE) — that is, whether AI is more rapidly improving its capabilities without the need for more compute, and how quickly that may be occurring. They also focused more heavily on how well AI can set research direction and select and interpret experiments.
The LessWrong researchers analyzed several modeling approaches, eventually settling on “capability benchmark trend extrapolation,” which uses current performance trends and standardized tests to predict future AI capabilities. They estimated artificial general intelligence (AGI)-required compute using METR’s time horizon suite, METR-HRS.
“Benchmark trends sometimes break, and benchmarks are only a proxy for real-world abilities, but… METR-HRS is the best benchmark currently available for extrapolating to very capable AIs,” the researchers wrote.
But while the model pulled heavily from the METR graph, the researchers also adjusted for several other factors.
For instance, compute, labor, data, and other AI inputs won’t continue to grow at the same rate; there’s a “significant chance” they will slow due to limits in chip production, energy resources, and financial investments.
The researchers estimated a one-year slowdown in parameter updates and a two-year slowdown in AI R&D automation due to diminishing returns in software research; they ultimately described the model as “pessimistic” in this area. They also projected slower growth in the leading AI companies’ compute amounts and in their human workforce.
Further, they built the model to be less “binary,” in the sense that it gives a lower probability to very fast or very slow takeoffs. Instead, it computes increases and assumes incremental progress.
“The model takes into account what we think are the most important dynamics and factors, but it doesn’t take into account everything,” the researchers noted. At the end of the day, they analyzed the results and made adjustments “based on intuition and other factors.”
Ultimately, they acknowledged, “we don’t think this model, or any other model, should be trusted completely.”
Incremental steps to AGI
Artificial general intelligence (AGI) is typically understood as AI that has human-level cognitive capabilities and can do nearly everything humans can. But instead of making the full leap from human intelligence to AI to AGI, the LessWrong researchers break the evolution into distinct steps.
The superhuman coder, for instance, will quickly make way for the “superhuman AI researcher” that can fully automate AI R&D and make human researchers obsolete. That will then evolve to a “superintelligent AI researcher,” representing a step-change where AI outperforms human specialists 2x more than the specialists outperform their median researcher colleagues.
Beyond that is top-human-expert-dominating AI, where AI can perform as well as human specialists on nearly all cognitive tasks and ultimately replaces 95% of remote work jobs.
Lastly comes artificial superintelligence (ASI), another step-change where models perform much better than top humans at virtually every cognitive task. The researchers anticipate ASI will occur five years after superhuman coding capabilities are achieved.
“AGI arriving in the next decade seems a very serious possibility indeed,” noted LessWrong researcher Daniel Kokotajlo. He and his colleagues split their model progress into stages, the last approaching the understood limits of human intelligence. “Already many AI researchers claim that AI is accelerating their work,” they wrote.
But, they added, “the extent to which it is actually accelerating their work is unfortunately unclear.” Likely, it is a “nonzero,” but potentially very small, impact that could increase as AI becomes more capable. Eventually, this could allow AI systems to outperform humans at “super exponential” speeds, according to the researchers, introducing yet another factor for consideration.
What this means for enterprises
The altered timeline is an “important signal” for enterprises, noted Sanchit Vir Gogia, chief analyst at Greyhound Research. It shows that even sophisticated models are “extremely sensitive” to assumptions about feedback loops, diminishing returns, and bottlenecks.
“The update matters less for the year it lands on and more for what it quietly admits about how fragile forecasting in this space really is,” he said.
Benchmark-driven optimism must be handled with care, he emphasized. While time horizon style benchmarks are useful indicators of progression, they are “poor proxies” for enterprise readiness.
From a CIO perspective, this isn’t a disagreement about whether AI can code; that debate is over, said Gogia. Enterprises should be using AI “aggressively” to compress cycle times while keeping humans accountable for outcomes. To this end, he is seeing more bounded pilots, internal tooling, gated autonomy, and strong emphasis on auditability and security.
It is also critical to correct the “mental model” for the next two to three years, Gogia noted. The dominant shift will not be to fully autonomous coding, but to AI-driven acceleration of processes across the enterprise. “Value will come from redesigning workflows, not from removing people,” he said. “The organizations that succeed will treat AI as a force multiplier inside a disciplined delivery system, not as a replacement for that system.”
Ultimately, repeatable results will reveal whether AI systems can handle complex, multi-repository, long-lived software that doesn’t require constant human rescue, Gogia said. “Until then, the responsible enterprise stance is neither dismissal nor blind belief, it is preparation.”
Automated data poisoning proposed as a solution for AI theft threat 6 Jan 2026, 9:06 pm
Researchers have developed a tool that they say can make stolen high-value proprietary data used in AI systems useless, a solution that CSOs may have to adopt to protect their sophisticated large language models (LLMs).
The technique, created by researchers from universities in China and Singapore, is to inject plausible but false data into what’s known as a knowledge graph (KG) created by an AI operator. A knowledge graph holds the proprietary data used by the LLM.
Injecting poisoned or adulterated data into a data system for protection against theft isn’t new. What’s new in this tool – dubbed AURA (Active Utility Reduction via Adulteration) – is that authorized users have a secret key that filters out the fake data so the LLM’s answer to a query is usable. If the knowledge graph is stolen, however, it’s unusable by the attacker unless they know the key, because the adulterants will be retrieved as context, causing deterioration in the LLM’s reasoning and leading to factually incorrect responses.
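To picture the mechanism, here is an illustrative sketch (my own simplification, not AURA’s actual construction): each injected fake triple carries a tag that only a key holder can recompute, so an authorized retriever can strip the adulterants before they reach the LLM, while an attacker without the key cannot tell real triples from fake ones.

import { createHmac } from 'node:crypto';

// Illustrative only: adulterant (fake) triples carry an HMAC tag derived from
// the secret key, so authorized retrieval can filter them out before generation.
type Triple = { subject: string; predicate: string; object: string; tag?: string };

const tagFor = (t: Triple, key: string) =>
  createHmac('sha256', key).update(`${t.subject}|${t.predicate}|${t.object}`).digest('hex');

function filterForAuthorizedUser(retrieved: Triple[], key: string): Triple[] {
  // Drop any triple whose tag matches the keyed scheme; without the key, real and
  // fake triples look indistinguishable, so a stolen graph stays poisoned.
  return retrieved.filter((t) => t.tag !== tagFor(t, key));
}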
The researchers say AURA degrades the performance of unauthorized systems to an accuracy of just 5.3%, while maintaining 100% fidelity for authorized users, with “negligible overhead,” defined as a maximum query latency increase of under 14%. They also say AURA is robust against various sanitization attempts by an attacker, retaining 80.2% of the adulterants injected for defense, and the fake data it creates is hard to detect.
Why is all this important? Because KGs often contain an organization’s highly sensitive intellectual property (IP), they are a valuable target.
Mixed reactions from experts
However, the proposal has been greeted with skepticism by one expert and with caution by another.
“Data poisoning has never really worked well,” said Bruce Schneier, chief of security architecture at Inrupt Inc., and a fellow and lecturer at Harvard’s Kennedy School. “Honeypots, no better. This is a clever idea, but I don’t see it as being anything but an ancillary security system.”
Joseph Steinberg, a US-based cybersecurity and AI consultant, disagreed, saying, “in general this could work for all sorts of AI and non-AI systems.”
“This is not a new concept,” he pointed out. “Some parties have been doing this [injecting bad data for defense] with databases for many years.” For example, he noted, a database can be watermarked so that if it is stolen and some of its contents are later used — a fake credit card number, for example — investigators know where that piece of data came from. Unlike watermarking, however, which puts one bad record into a database, AURA poisons the entire database, so if it’s stolen, it’s useless.
AURA may not be needed in some AI models, he added, if the data in the KG isn’t sensitive. The real unanswered question is what the real-world trade-off between application performance and security would be if AURA is used.
He also noted that AURA doesn’t solve the problem of an undetected attacker interfering with the AI system’s knowledge graph, or even its data.
“The worst case may not be that your data gets stolen, but that a hacker puts bad data into your system so your AI produces bad results and you don’t know it,” Steinberg said. “Not only that, you now don’t know which data is bad, or which knowledge the AI has learned is bad. Even if you can identify that a hacker has come in and done something six months ago, can you unwind all the learning of the last six months?”
This is why Cybersecurity 101 – defense in depth – is vital for AI and non-AI systems, he said. AURA “reduces the consequences if someone steals a model,” he noted, but whether it can jump from a lab to the enterprise has yet to be determined.
Knowledge graphs 101
A bit of background about knowledge graphs: LLMs use a technique called Retrieval-Augmented Generation (RAG) to search for information based on a user query and provide the results as additional reference for the AI system’s answer generation. In 2024, Microsoft introduced GraphRAG to help LLMs answer queries needing information beyond the data on which they have been trained. GraphRAG uses LLM-generated knowledge graphs to improve performance and lower the odds of hallucinations in answers when performing discovery on private datasets such as an enterprise’s proprietary research, business documents, or communications.
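As a rough sketch of that retrieval loop (a generic illustration, not GraphRAG itself), the embedding and generation calls below are stand-ins for whatever models a given system actually uses.

type Doc = { id: string; text: string; embedding: number[] };

// Cosine similarity between a query embedding and a document embedding.
const cosine = (a: number[], b: number[]) => {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) { dot += a[i] * b[i]; na += a[i] ** 2; nb += b[i] ** 2; }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
};

async function answerWithRag(
  query: string,
  corpus: Doc[],
  embed: (text: string) => Promise<number[]>, // stand-in embedding model
  generate: (prompt: string) => Promise<string>, // stand-in LLM call
): Promise<string> {
  const q = await embed(query);
  const context = corpus
    .map((d) => ({ d, score: cosine(q, d.embedding) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, 5) // top-k retrieval
    .map(({ d }) => d.text)
    .join('\n---\n');
  // The retrieved passages are handed to the model as additional reference for its answer.
  return generate(`Answer using only this context:\n${context}\n\nQuestion: ${query}`);
}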
The proprietary knowledge graphs within GraphRAGs make them “a prime target for IP theft,” just like any other proprietary data, says the research paper. “An attacker might steal the KG through external cyber intrusions or by leveraging malicious insiders.”
Once an attacker has successfully stolen a KG, they can deploy it in a private GraphRAG system to replicate the originating system’s powerful capabilities, avoiding costly investments, the research paper notes.
Unfortunately, the low-latency requirements of interactive GraphRAG make strong cryptographic solutions, such as homomorphic encryption of a KG, impractical. “Fully encrypting the text and embeddings would require decrypting large portions of the graph for every query,” the researchers note. “This process introduces prohibitive computational overhead and latency, making it unsuitable for real-world use.”
AURA, they say, addresses these issues, making stolen KGs useless to attackers.
AI is moving faster than AI security
As the use of AI spreads, CSOs have to remember that artificial intelligence and everything needed to make it work also make it much harder to recover from bad data being put into a system, Steinberg noted.
“AI is progressing far faster than the security for AI,” Steinberg warned. “For now, many AI systems are being protected in similar manners to the ways we protected non-AI systems. That doesn’t yield the same level of protection, because if something goes wrong, it’s much harder to know if something bad has happened, and it’s harder to get rid of the implications of an attack.”
The industry is trying to address these issues, as the researchers observe in their paper. One useful reference, they note, is the US National Institute of Standards and Technology (NIST) AI Risk Management Framework, which emphasizes the need for robust data security and resilience, including the importance of developing effective KG protection.
This article originally appeared on CSOonline.
Ruby 4.0.0 introduces ZJIT compiler, Ruby Box isolation 6 Jan 2026, 3:45 pm
Ruby 4.0.0 has arrived as the newest release of the interpreted, object-oriented Ruby programming language. The update features a new just-in-time compiler, ZJIT, and an experimental “Ruby Box” capability for in-process separation of classes and modules.
Released on December 25, 2025, Ruby 4.0.0 can be downloaded from ruby-lang.org.
Ruby Box is a new feature designed to provide separate spaces in a Ruby process for isolating code, libraries, and monkey patches. Anticipated use cases for Ruby Box include running test cases in a box to protect other tests when a test case uses monkey patches to override something, running web app boxes in parallel for blue-green deployments on an app server in a Ruby process, and running web app boxes in parallel to evaluate dependency updates for a specific time period by checking response diffs. Note that Ruby Box is currently experimental and comes with a few known issues.
Ruby 4.0.0 also introduces ZJIT, a new just-in-time compiler intended to be the next generation of YJIT. Built into Ruby’s YARV reference implementation, ZJIT is faster than the interpreter, but not yet as fast as YJIT. Developers are encouraged to experiment with ZJIT, but maybe hold off on deploying it in production for now. Users are advised to stay tuned for Ruby 4.1 ZJIT.
Also in Ruby 4.0.0, Ruby’s parallel execution mechanism, Ractor, has received improvements including a new class, Ractor::Port, to address issues pertaining to message sending and receiving, and Ractor.shareable_proc, to make it easier to share Proc objects between Ractors. For performance, many internal data structures in Ractor have been improved to reduce contention on a global lock, resulting in better parallelism. Ractors now also share less internal data, resulting in less CPU contention when running in parallel.
Ruby first emerged in 1995. Other features in Ruby 4.0.0 include the following:
- *nil no longer calls nil.to_a, similar to how **nil does not call nil.to_hash.
- For core classes, Array#rfind has been added as a more efficient alternative to array.reverse_each.find.
- Enumerator.produce now accepts an optional size keyword argument to specify the enumerator size.
- Kernel#inspect now checks for the existence of an #instance_variables_to_inspect method, allowing control over which instance variables are displayed in the #inspect string.
Open WebUI bug turns the ‘free model’ into an enterprise backdoor 6 Jan 2026, 3:28 am
Security researchers have flagged a high-severity flaw in Open WebUI, a self-hosted enterprise interface for large language models, that allows external model servers connected via its Direct Connections feature to inject malicious code and hijack AI workloads.
The issue, tracked as CVE-2025-64496, stems from unsafe handling of server-sent events (SSE), enabling account takeover and, in some cases, with extended permissions, remote code execution (RCE) on backend servers.
According to Cato CTRL findings, if an employee connects Open WebUI to an attacker-controlled model endpoint (under the pretext of a “free GPT-4 alternative,” for example), the frontend can be tricked into silently executing injected JavaScript. That code steals JSON Web Tokens (JWTs) from the browser context, giving attackers persistent access to the victim’s AI workspace, documents, chats, and embedded API keys.
The bug impacts Open WebUI versions up to 0.6.34 and is fixed in v0.6.35, with enterprises urged to patch production deployments without delay.
Convenience feature turned into a crisis
Cato researchers said the problem is Direct Connections, a feature intended to let users connect Open WebUI to external, OpenAI-compatible model servers. The platform’s SSE handler trusts incoming events from these servers, especially those tagged as “{type: execute},” and executes their payload via a dynamic JavaScript constructor.
When a user connects to a malicious server, easily enabled through social engineering, that server can stream an SSE with executable JavaScript. That script runs with full access to the browser’s storage layer, including the JWT used for authentication.
“Open WebUI stores the JWT token in localStorage,” Cato researchers said in a blog post. “Any script running on the page can access it. Tokens are long-lived by default, lack HttpOnly, and are cross-tab. When combined with the execute event, this creates a window for account takeover.”
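A simplified sketch of that pattern (illustrative only, not Open WebUI’s actual source) shows why this is dangerous: any connected model server that can emit an “execute” event gets arbitrary script execution in the victim’s session.

// Illustrative front-end pattern, not Open WebUI's code.
const renderChatChunk = (chunk: unknown) => { /* render a normal chat token */ };

// A user-added "free model" endpoint that is actually attacker-controlled.
const events = new EventSource('https://models.example.test/v1/chat');

events.onmessage = (event) => {
  const msg = JSON.parse(event.data); // JSON.parse returns any, so msg.type and msg.code pass type checks

  if (msg.type === 'execute') {
    // Dynamic code construction: the remote server's payload runs with full access
    // to the page, including the JWT sitting in localStorage.
    new Function(msg.code)();
  } else {
    renderChatChunk(msg);
  }
};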
The attack requires the victim to enable Direct Connections (disabled by default) and add the attacker’s malicious model URL, according to an NVD description.
Escalating to remote code execution
The risk doesn’t stop at account takeover. If the compromised account has workspace.tools permissions, attackers can leverage that session token to push authenticated Python code through Open WebUI’s Tools API, which executes without sandboxing or validation.
This turns a browser-level compromise into full remote code execution on the backend server. Once an attacker gets Python execution, they can install persistence mechanisms, pivot into internal networks, access sensitive data stores, or run lateral attacks.
The flaw received a high severity rating, with an 8/10 base score from NVD and a 7.3/10 base score from GitHub. It was rated high rather than critical because exploitation requires the Direct Connections feature to be enabled and hinges on a user first being lured into connecting to a malicious external model server. The patch in Open WebUI v0.6.35 blocks “execute” SSE events from Direct Connections entirely, but any organization still on older builds remains exposed. Additionally, the researchers advised moving authentication to short-lived, HttpOnly cookies with rotation. “Pair with a strict CSP and ban dynamic code evaluation,” they added.
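For teams hardening while they patch, the cookie and CSP advice maps onto standard controls; here is a hedged Express-style sketch (an assumed stack for illustration, not Open WebUI’s implementation).

import express from 'express';
import { randomUUID } from 'node:crypto';

const app = express();

// A strict CSP without 'unsafe-eval' blocks eval() and new Function() outright.
app.use((_req, res, next) => {
  res.setHeader('Content-Security-Policy', "default-src 'self'; script-src 'self'");
  next();
});

app.post('/api/login', (_req, res) => {
  const token = randomUUID(); // stand-in for real session issuance
  res.cookie('session', token, {
    httpOnly: true, // not readable by page scripts, unlike localStorage
    secure: true,
    sameSite: 'strict',
    maxAge: 15 * 60 * 1000, // short-lived; rotate on use
  });
  res.sendStatus(204);
});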
What drives your cloud security strategy? 6 Jan 2026, 1:00 am
Consider a fictitious company, DeltaSite, and an all-too-common scenario for rapidly expanding SaaS providers. Within months, DeltaSite embarked on an ambitious multicloud migration, deploying critical workloads across AWS, Azure, and Google Cloud. DeltaSite’s board approved a seven-figure investment in the latest cloud security tools, including AI-powered monitoring and automated compliance frameworks, believing this would virtually guarantee security.
Yet just six months after going live, DeltaSite suffered a major breach: A single misconfigured storage bucket exposed sensitive customer data to the public internet. Despite their investment in advanced tools, the breach stemmed from a basic error that went unnoticed. This situation highlights a growing industrywide problem: Too many organizations place their confidence in technology while overlooking the foundational importance of skilled, well-trained cloud security talent.
Security incidents rise despite better tools
Cloud security incidents have spiked by 61% in 2025, with nearly two-thirds of organizations reporting at least one critical event. At first glance, it’s natural to blame the size and complexity of cloud environments or to scapegoat attackers using increasingly sophisticated techniques. But a closer look at the headlines and breach reports reveals a different pattern.
The root causes remain stubbornly familiar: persistent misconfigurations, compromised credentials, and the unchecked growth of shadow IT. These failures are not from a lack of technology. They stem from over-reliance on tools at the expense of building internal expertise. Automated scanners and dashboards identify risks, but without knowledgeable staff, the warnings go unheeded or misunderstood. This pattern is happening everywhere as companies race into multicloud adoption without corresponding investment in people.
The solution is talent, not tools
In the past five years, the supply of cloud security talent has sharply declined. The rush to the cloud created a talent bottleneck that hasn’t fully resolved. Instead of hiring skilled teams, organizations relied on AI-powered tools, yet human errors persist, with automation amplifying them rather than improving judgment. Misconfigurations cause data leaks and breaches, which attackers increasingly exploit using stolen credentials. Enterprises expand their cloud use, often outside IT oversight. The growth of shadow IT and new services makes configuration issues inevitable, which often go unaddressed by underqualified teams.
There is no shortage of high-caliber technology in today’s market. The promise of cloud security platforms is enticing. Dashboards can identify risk in real time, automated compliance frameworks map out vulnerabilities, and AI-driven anomaly detection is ready to outsmart the next would-be attacker. However, technology alone cannot compensate for staff inexperience, nor can it force good cloud hygiene on an organization that hasn’t invested in training.
The challenge today is not discovery; it is interpretation, governance, and follow-through. Real security comes from experienced practitioners who understand how cloud services interact, can investigate policy violations, and can adapt controls to changing regulatory and operational demands. Without this expertise, even the most advanced tools can only reveal information that remains unused or misunderstood. The headlines about misconfiguration-driven data leaks—like the one at DeltaSite—prove the point: Talent failure, not tool failure, leads to breaches.
Enterprise action steps
The status quo is unsustainable. Cloud incidents are surging, regulatory scrutiny is intensifying, and breach-related losses continue to mount. Enterprises must recognize that their most critical investment is not another dashboard but the continuous development of talent. The cycle of over-reliance on technology at the expense of people must be broken. Organizations should take concrete steps to address the rising threat of misconfigurations and prevent further erosion of cloud security resilience.
First, organizations must commit to ongoing, role-specific training for every cloud security professional. This goes beyond certificates; it requires time for learning, practice, and real-world problem-solving on evolving platforms.
Second, enterprises must build strong cross-departmental governance to ensure a single accountable authority for cloud adoption, configuration, and oversight. This limits shadow IT and focuses responsibility in the right places.
Third, companies should regularly invite outside consultants, not merely for one-time audits, but for collaborative engagements that transfer knowledge and bring best practices into the team.
Fourth, a culture of continuous improvement is essential; security incidents should trigger not only remediation but also structured post-incident reviews that provide feedback into team education and evolving processes.
Cloud security is now the most challenging aspect of digital modernization, and it will only get harder unless enterprise leaders rediscover the importance and true value of skilled talent. The escalating number of breaches, compliance failures, and regulatory actions proves that tools alone cannot fix what is primarily a people problem. The organizations that thrive in the coming years will be those that place skilled, curious, and well-supported practitioners at the core of their security strategy. In the end, the best investment is in people, not just products.
Generative AI and the future of databases 6 Jan 2026, 1:00 am
How must databases adapt to generative AI, and how should databases be integrated with large language models (LLMs)? These are questions that Sailesh Krishnamurthy has grappled with for several years now. As VP of engineering for databases at Google Cloud, Krishnamurthy leads the database team for Google Cloud and all Google services including Google Search and YouTube. He also leads a program to leverage generative AI and Google’s Gemini models in database management.
I recently talked with Krishnamurthy about the challenges of bringing databases and LLMs together, the difficulties of generating SQL from natural language, how the Google Cloud database team is addressing these problems, and how databases are evolving to meet the requirements of generative AI applications and their users. The following interview has been edited for brevity and clarity.
Martin Heller: Shall we talk about the disconnect between LLMs and operational data? How do we bridge this gap?
Sailesh Krishnamurthy: I think LLMs have an enormous amount of world knowledge and then I think the logical thing is for enterprises to combine it with data inside the enterprise. The easiest thing to deal with there is the information retrieval problem when you have a corpus of documents but the part that gets hard is when you have to go beyond documents and you have to extract the data out of the database and go stick it in some document and submit that to a search engine.
The data is heavily permissioned and has to be secure. We worry about exfiltration and access. The data is at the heart of your line of business application, but it is also changing all the time, and if you keep extracting the data into some other corpus it gets stale. You can view it as two approaches: replication or federation. Am I going to replicate out of the database to some other thing or am I going to federate into the database?
Heller: What should we understand about these two approaches, federation and replication?
Krishnamurthy: At some level federation and replication are duals of each other. You can do either thing; the question is what makes sense where. What we see people doing is using context, schemas, and a representation of what data exists, so that LLMs and other orchestration mechanisms outside can interrogate those data sources on the fly and provide answers. I think that’s the meta trend, and not just against databases but against many other proprietary systems.
Slack is a good example, or some other kind of system which offers an API to bridge that gap. I think there are many challenges there because when it comes to databases, as an industry we’ve tended to build microservices that have very narrow apertures into the database, and if you just make those microservices available to the LLM then all the LLM is able to see is that narrow window of data.
There is security because applications typically connect to the database through a service principal. The end user in the enterprise is a logged-in user to something like Gemini for Enterprise. If they want to connect to the database, then that end user is different from the service principal the agent uses to connect to the database, so getting the security right is tricky. These are some of the problems that I tend to see. So how do you bridge the gap? It’s a big deal for us. We have a series of solutions.
Heller: Tell us about those solutions. What has Google come up with?
Krishnamurthy: You want to light up operational databases and make them available for agentic applications. There’s a sequence of options on the table. The simplest is that you enable tools that can take the data from the database and make them available for the LLM.
We built something called the MCP Toolbox for Databases. It’s open source. I never thought I’d be talking about being responsible for a viral hit, but it’s as viral as it gets for us. [Editor’s note: MCP Toolbox has 11.9K stars on GitHub.] It’s not just us. It supports our first-party databases, but it’s open. Many of our competitors have signed on to use the MCP Toolbox for Databases. It’s a very easy way to connect an LLM or an orchestration system of any sort that uses LLMs to these databases. It solves a whole bunch of problems around connectivity and security and so on. Now, in addition to the standard tools that come out of the box where you can go and interrogate the database, it provides the ability for users to add custom tools.
These custom tools are just canned queries. You can think of them as templated queries with certain parameters that the LLM fills in. Writing the correct English-language description of these queries turns out to be a non-trivial problem.
I think engineers know how to write good SQL queries. Whether they know how to write good English-language descriptions of the SQL queries is a completely different matter, but let’s assume for a second we can, or we can have AI do it for us. Then the AI can figure out which tool to call for the user request and then generate the parameters. There are some things to worry about in terms of security. How can you set the right secure parameters? What parameters is the LLM allowed to set versus not allowed to set? But that’s the simplest way to solve the problem… to make it easy to connect LLMs to operational databases. This was the simplest version of the problem.
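As a rough illustration of that templated-query idea (my own sketch using the MCP TypeScript SDK and an assumed Postgres driver, not the MCP Toolbox’s actual configuration format): the SQL text stays fixed, the English description is what the LLM reads when deciding to call the tool, and the model is only allowed to supply parameter values.

import { McpServer } from '@modelcontextprotocol/sdk/server/mcp.js';
import { z } from 'zod';
import pg from 'pg'; // assumed driver for the example

const db = new pg.Pool({ connectionString: process.env.DATABASE_URL });
const server = new McpServer({ name: 'orders-tools', version: '1.0.0' });

server.tool(
  'recent_orders_for_customer',
  'List the most recent orders for a given customer id', // what the LLM reads
  { customerId: z.string(), limit: z.number().int().max(100).default(10) },
  async ({ customerId, limit }) => {
    // The query text is fixed; the model can only fill in the parameter values.
    const { rows } = await db.query(
      'SELECT order_id, total, created_at FROM orders WHERE customer_id = $1 ORDER BY created_at DESC LIMIT $2',
      [customerId, limit]
    );
    return { content: [{ type: 'text', text: JSON.stringify(rows) }] };
  }
);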
The second set of things that you can do is automatically generating SQL from natural language descriptions. It’s going to be very painful to write these custom tools but it makes sense to write them, because they fill the gaps. The APIs are too narrow but you don’t want to build a new API layer. I can go against the database with these custom tools, but if I want to do something fundamentally more open-ended, and I think the seductive promise of AI is to do the more open-ended things, then I need the ability to run more free form queries against the database, and we need to do that securely. So the problems we have to solve there are first about accuracy: How can we make sure we generate highly accurate queries. Our team just last week took the leaderboard position on the world’s leading benchmark on natural language to SQL. The science continues to improve in terms of how well you can do this.
Heller: What are these factors? How can SQL generation from natural language be improved?
Krishnamurthy: One part of it is context. The more you can tell the model about the schema and the metadata, the better it’s going to get. Certain things are more obvious than others. For instance, one of the examples I like to give is you may have a table of an e-commerce scenario and you may have a billing address and a shipping address column, and there’s some implicit information that if one of them is null then they are supposed to both be the same. That’s an implicit assumption by the people who designed the schema. If you know that assumption, then the SQL query against the system is going to be more accurate. So there is a set of things that we are doing to build better context in automatically extracting schemas, and build better information so that natural language to SQL works well.
There are also other things we’re doing, innovating right inside the model. This is the power of Google: We are at every level of the stack from the chips to the infrastructure to the models to the databases. We work closely with our colleagues at Google DeepMind to make the models themselves better. This combination of making the models better and providing the best possible context to the model results in better, more accurate queries.
The next step is also being secure. The challenge is, if the LLM can run any possible query against the database, then how do you make sure you don’t exfiltrate and leak information? We’ve built this technology that we call parameterized secure views in the database itself, that lets you define the right secure barriers and encodes the security policies that you need, so that the LLM can generate any query it wants, but with respect to the logged-in user we will not let them see any information that they are not supposed to see. We’ll also, on an information-theoretical basis, not leak information that they should not have access to.
Heller: I know you’ve spent a lot of time thinking about the future of databases and generative AI. Where are we headed?
Krishnamurthy: Part of my thinking here has evolved over the last couple of years, but for 50 years the world of databases has been at least SQL databases where it was all about producing exact results. I like to say databases had one job: store the data, don’t lose the data, and then when you ask a question, give the exact result. OK, maybe two jobs. It was all about exact results because we’re dealing with structured data. I think the biggest change that’s happening right now is that we are no longer just dealing with structured data. We’re also dealing with unstructured data. When you combine structured and unstructured data, the next step is that it’s not just about exact results but about the most relevant results. In this sense databases start to have some of the capabilities of search engines, which is about relevance and ranking, and what becomes important is almost like precision versus recall for information retrieval systems. But how do you make all of this happen? One key piece is vector indexing. In other words, you have structured data, which is in the database, but we have other kinds of information, unstructured data, semi-structured data.
Some of the unstructured data may live in database columns and rows, but many of them will live in object stores. It could be image data, video data, maybe geospatial data, PDFs. The data that needs to be in object stores will remain in object stores. But what you can do and what we’re seeing our customers do is to extract vector embeddings of the data in object stores and co-locate them with the database. What you can now do is ask questions that combine the structured and the unstructured data.
There are other ways people have tried to go at this problem, for instance to let the database do its own thing. We’ll build a custom separate vector store for it. That’s a path which is very hard to make work well because in practice you will have to stitch these systems together. Users want to just combine information.
A great example is Target.com. All of Target’s online catalog and the vector search for that has moved to AlloyDB. One of the key requirements they had was in addition to searching by image or searching by description the customer may also care about price, which is there in the database. It’s not in the object store. They may care about the inventory of the item in the physical store closest to the logged-in user.
Heller: Or maybe within 10 miles.
Krishnamurthy: Yeah, that becomes an additional parameter. It becomes a geospatial parameter predicate, right? But the point is, if you were to stitch this at the application level, it’s really hard to do, because on a per request basis the selectivity of these different predicates varies, so you don’t know whether it’s right.
If I use a separate vector index, I could probe it and get the first 10 results, and then I go to my regular database and when I apply all my other predicates it would get no results. That doesn’t help the business because Target wants to provide results that people are going to click and then purchase. If you go to the database first, you might get too many results to combine with a nearest neighbor vector search. So, what we have figured out is to have these different indexes in one system. We built this thing called adaptive filter vector search where we combine these index probes together, so we are always able to produce the best results in the lowest possible time and with the best quality. And the results are pretty amazing with Target.com. They were able to reduce the number of no results pages by 50% and improve business outcomes by about 20%.
And for me it was a very interesting realization because it meant it was a great example that the world had changed. My world of being in this industry where I only focus on exact results had changed and the new world is one where we are going back to how Google built spellcheck in search, which we all take for granted now but 20 or 25 years back it was an iterative improvement so the system became better and better over time. I think people who are building these new kinds of applications are going to be thinking fundamentally “I have to keep iterating, I have to keep improving.”
Heller: I think this brings us to the concept of AI-native databases, which you have talked about. What is an AI-native database from Google’s point of view?
Krishnamurthy: The AI-native database that we think about is one which has vector index hybrid search that combines vector search, full text search, and other kinds of structured search in one system. We have multimodal databases that are also emerging. For instance, at Google we have Spanner and we provide a graph interface in the data. People’s expectations have fundamentally changed: they want us to make all the data work together because they are no longer asking questions just in exact results. They’re asking very general-purpose questions and they want us to find ways to connect the data and provide the results.
AI search is a key attribute of an AI-native database. And the other key attribute is AI functions. We are starting to put in the ability to ask questions using LLM technology right inside the database. For instance, I might want to say we provide a SQL function called AI which will let you do something interesting with respect to a value, a field, a piece of text that’s already in a database column. You might have an interesting question about the brands you care about, brands that are only American.
If you think about the database, it’s storing structured information, but a lot of it is just textual information. It’s things that matter. You can provide all of that information in this AI-native database by making it offer a very rich set of primitives.
If I step back to where I was before, I talked about natural language search which made applications richer. These fit very well together because people ask questions against the database in natural language. If I’m able to go into an AI-native database, the quality of those results becomes more interesting and more valuable. If I step back, my answer to your question in the beginning is the simplest possible approach: custom tools which look much like the existing world, but we’ve disaggregated the application. We’ve chopped it up into little tools that are issuing queries, and we connect them to AI agents.
That’s the first step. The next step is to ask questions that I didn’t anticipate, and in order to do that, I need accuracy and security. We’ve made great strides on both of these. I’m not going to say we’ve solved those problems, because they are hard, but we’re really very excited about our progress. The third step is the actual database itself becoming more intelligent and more able to deal with different kinds of data.
C# wins Tiobe Programming Language of the Year honors for 2025 5 Jan 2026, 4:03 pm
Microsoft’s C# has won the Tiobe Index Programming Language of the Year designation for the second time in three years, posting the largest year-over-year increase in ratings in the company’s programming language popularity index. Meanwhile, another Microsoft language, TypeScript, may crack the index’s top 20 this year, according to Tiobe CEO Paul Jansen.
Tiobe announced C# as its language of the year for 2025 on January 4. C# rose 2.94 percentage points year over year, with a rating of 7.39% and a ranking of fifth this month. C#’s winning the award had been expected; the language was also Tiobe’s language of the year for 2023. “From a language design perspective, C# has often been an early adopter of new trends among mainstream languages,” wrote Jansen, CEO of the software quality services vendor, in a bulletin accompanying the January 2026 index. “At the same time, it successfully made two major paradigm shifts: from Windows-only to cross-platform, and from Microsoft-owned to open source. C# has consistently evolved at the right moment.” Jansen added that he had expected C# to prevail against Java for dominance in the business software market, but the contest at this point remains undecided. “It is an open question whether Java—with its verbose, boilerplate-heavy style and Oracle ownership—can continue to keep C# at bay,” said Jansen. Java was rated third in this month’s index behind Python and C, with an 8.71% rating.
Jansen concluded his bulletin with a prediction about TypeScript, Microsoft’s JavaScript with syntax for types. “I have a long history of making incorrect predictions, but I suspect that TypeScript will finally break into the top 20,” Jansen said. The language is currently ranked 32nd. “The reason why I think TypeScript will grow is because I see that a lot of front-end software (user interfaces) are written in TypeScript instead of JavaScript nowadays,” Jansen said. “The advantage of TypeScript over JavaScript is that it is type-safe.” If developers use TypeScript in the right way, it is much harder to shoot yourself in the foot, Jansen said. “Adopting TypeScript is without any risks because TypeScript compiles to JavaScript. Hence, you can always go back to JavaScript if you don’t like TypeScript.”
Tiobe’s monthly index ratings are based on a formula that assesses the number of skilled engineers worldwide, courses, and third-party vendors pertinent to a language. Popular websites such as Google, Amazon, Wikipedia, Bing, and more than 20 others are used to calculate the ratings. Elsewhere in the index for January 2026, Jansen said Go appears to have permanently lost its place in the top 10 last year. The same seems true for Ruby, which fell out of the top 20 and is unlikely to return anytime soon, according to Jansen. But Perl made a surprising comeback, rising from position number 32 in January 2025 to number 11 by the end of the year. Another resurgent language is R, which returned to the top 10 in 2025, driven largely by continued growth in data science and statistical computing, Jansen said. The C language, now ranked second, and C++, now in fourth place, swapped positions in the index over the past year.
The Tiobe index top 10 for January 2026:
- Python, 22.61%
- C, 10.99%
- Java, 8.71%
- C++, 8.67%
- C#, 7.39%
- JavaScript, 3.03%
- Visual Basic, 2.41%
- SQL, 2.27%
- Delphi/Object Pascal, 1.98%
- R, 1.82%
The alternative Pypl Popularity of Programming Language Index assesses language popularity based on a formula that analyzes how often language tutorials are searched on in Google.
The Pypl index top 10 for January 2026:
How to make AI agents reliable 5 Jan 2026, 1:00 am
It’s time to wake up from the fever dream of autonomous AI. For the past year, the enterprise software narrative has been dominated by a singular, intoxicating promise: agents. We were told that we were on the verge of deploying digital employees that could plan, reason, and execute complex workflows while we watched from the sidelines, a notion that Lena Hall, Akamai senior director of developer relations, rightly pillories.
But if you look at what is actually shipping in the enterprise, the reality is starkly different.
Drew Breunig recently published a sober analysis that acts as a necessary corrective to the hype. Breunig synthesizes data from several reports, including a study, “Measuring Agents in Production,” and reveals a simple, inconvenient truth: The biggest obstacle to an agent-driven future is agentic unreliability.
In other words, most enterprise agents aren’t failing because the models aren’t smart enough; they’re failing because they aren’t boring enough.
The reliability gap
Breunig’s core finding is that successful production agents are “simple and short.” They don’t autonomously navigate the internet to solve open-ended problems. Instead, 68% of them execute fewer than 10 steps before handing control back to a human or concluding the task.
This aligns perfectly with what I’ve been calling AI’s trust tax. In our rush to adopt generative AI, we forgot that while intelligence is becoming cheap, trust remains expensive. A developer might be impressed that an agent can solve a complex coding problem 80% of the time. But a CIO looks at that same agent and sees a system that introduces a 20% risk of hallucination, data leakage, or security vulnerability into a production environment.
That 20% gap is the reliability gap. As Breunig notes, rational employees don’t adopt unreliable tools. They route around them.
Easier said than done. After all, the way generative AI works, we’re trying to build deterministic software on top of probabilistic models. Large language models (LLMs), cool though they may be, are non-deterministic by nature. Chaining them together into autonomous loops amplifies that randomness. If you have a model that is 90% accurate, and you ask it to perform a five-step chain of reasoning, your total system accuracy drops to roughly 59%.
That isn’t an enterprise application; it’s a coin toss—and that coin toss can cost you. Whereas a coding assistant can suggest a bad function, an agent can actually take a bad action.
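The arithmetic behind that coin toss is easy to verify: if steps are independent, end-to-end accuracy is the product of the per-step accuracies. A quick sketch (the probabilities are illustrative):

```python
# Compounding error in a multi-step agent: end-to-end accuracy is the
# product of the per-step accuracies (assuming independent steps).
def chain_accuracy(step_accuracy: float, steps: int) -> float:
    return step_accuracy ** steps

print(f"{chain_accuracy(0.90, 5):.0%}")  # ~59%: five 90%-accurate steps
print(f"{chain_accuracy(0.99, 5):.0%}")  # ~95%: the same chain with 99%-reliable steps
```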
The solution, counterintuitively, is to be less ambitious.
Lower ambitions mean greater success
Breunig argues that the path forward is to “deliberately constrain agent autonomy.” This is exactly right. In other words, we need to stop trying to build “God-tier” agents that can do everything and start building “intern-tier” agents that do one thing perfectly.
This brings us back to the concept of the “golden path,” something I’ve been writing about repeatedly.
We don’t want platform engineering teams to become insurmountable obstacles that only know the word “no” (that’s what the legal team is for—kidding!). Platform teams should build paved roads (golden paths) that make the right way to build software also the easiest way. For agents, this means creating standardized, governed frameworks where the blast radius is contained by design. A golden path for an agent might look like this:
- Narrow scope: The agent is authorized to perform exactly one function (e.g., “reset password” or “summarize JIRA ticket”), not “manage IT support.”
- Read-only by default: The agent can read data to answer questions but requires explicit human approval to write to a database or call an external API. This is key to building AI agents the safe way.
- Structured output: We stop relying on vibes and start enforcing schemas. The agent shouldn’t just chat; it should return structured JSON that can be validated by code before it triggers any action (see the sketch after this list).
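To make the “structured output” and “read-only by default” ideas concrete, here is a minimal sketch in Python of validating an agent’s JSON output and gating writes behind human approval. The field names, schema, and approval flow are illustrative assumptions, not any particular framework’s API.

```python
# Minimal sketch: validate agent output against a schema and gate writes
# behind human approval. Field names and the approval flow are illustrative.
import json

REQUIRED_FIELDS = {"action": str, "ticket_id": str, "summary": str}

def parse_agent_output(raw: str) -> dict:
    """Reject anything that is not well-formed, fully typed JSON."""
    data = json.loads(raw)  # raises ValueError on malformed output
    for field, expected_type in REQUIRED_FIELDS.items():
        if not isinstance(data.get(field), expected_type):
            raise ValueError(f"missing or mistyped field: {field}")
    return data

def execute(data: dict, human_approved: bool) -> str:
    """Read-only by default: anything that writes requires an explicit human yes."""
    if data["action"] != "read" and not human_approved:
        return "blocked: write action needs human approval"
    return f"executing {data['action']} on {data['ticket_id']}"
```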
All good, but more is needed. I’m talking about how we handle agent memory.
Memory is a database problem
Breunig highlights “context poisoning” as a major reliability killer, where an agent gets confused by its own history or irrelevant data. We tend to treat the context window like a magical, infinite scratchpad. It isn’t. It is a database of the agent’s current state. If you fill that database with garbage (unstructured logs, hallucinated prior turns, or unauthorized data), you get garbage out.
If you want reliable agents, you need to apply the same rigor to their memory that you apply to your transaction logs:
- Sanitization: Don’t just append every user interaction to the history. Clean it.
- Access control: Ensure the agent’s “memory” respects the same row-level security (RLS) policies as your application database. An agent shouldn’t “know” about Q4 financial projections just because it ingested a PDF that the user isn’t allowed to see.
- Ephemeral state: Don’t let agents remember forever. Long contexts increase the surface area for hallucinations. Wipe the slate clean often (a sketch of this hygiene follows the list).
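Here is a minimal sketch of that hygiene in Python: filtering out memories the current user isn’t cleared to see and keeping only a short tail of recent turns. The clearance levels, field names, and turn limit are illustrative assumptions, not a specific product’s API.

```python
# Minimal sketch of context hygiene. Clearance levels, field names, and the
# turn limit are illustrative assumptions.
CLEARANCE = {"public": 0, "internal": 1, "restricted": 2}
MAX_TURNS = 20

def build_context(history: list[dict], user_level: str) -> list[dict]:
    """Return only what this user may see, trimmed to the most recent turns."""
    allowed = CLEARANCE[user_level]
    # Access control: drop memories the caller is not authorized to read.
    visible = [m for m in history
               if CLEARANCE.get(m.get("clearance", "public"), 0) <= allowed]
    # Ephemeral state: long contexts invite hallucination, so keep a short tail.
    return visible[-MAX_TURNS:]
```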
My Oracle colleague Richmond Alake calls this emerging discipline “memory engineering” and, as I’ve covered before, frames it as the successor to prompt engineering or context engineering. You can’t just add more tokens to a context window to improve a prompt. Instead, you must create a “data-to-memory pipeline that intentionally transforms raw data into structured, durable memories: short term, long term, shared, and so on.”
The rebellion against robot drivel
Finally, we need to talk about the user. One reason Breunig cites for the failure of internal agent pilots is that employees simply don’t like using agents. A big part of this is what I call the rebellion against robot drivel. When we try to replace human workflows with fully autonomous agents, we often end up with verbose, hedging, soulless text, and it’s increasingly obvious to the recipient that AI wrote it, not you. And if you can’t be bothered to write it, why should they bother to read it?
This is why keeping a human in the loop isn’t just a safety feature; it’s a quality feature. It’s also how you start to bootstrap trust. You start with suggestion mode, then graduate to partial automation only where you have measured reliability. Unsurprisingly, then, the most successful agents described in the reports Breunig cites are those that augment human work rather than replace it. They act as a copilot (to borrow the Microsoft nomenclature) that drafts the email, writes the SQL query, or summarizes the report, but then pauses and asks a human: “Does this look right?”
The reliability is high because the human is the final filter. The trust tax is low because the human remains accountable.
The boring revolution
We are leaving the phase of AI magical thinking and entering the phase of AI industrialization. The headlines about artificial general intelligence and superintelligence are fun, but they are distractions for the enterprise developer. AI is all about inference now, or the application of models to specific, governed data.
As Breunig’s analysis confirms, the agents that will actually survive in the enterprise aren’t the ones that promise to do everything. They do a few things reliably, securely, and boringly. The cure, in short, is not to wait for GPT-6 as some AI panacea. The cure is boring engineering that constrains blast radius, governs state, measures reality, and earns trust, one small workflow at a time.
In the enterprise, “boring” is what scales.
6 incredibly hyped software trends that failed to deliver 5 Jan 2026, 1:00 am
“We are such stuff as dreams are made on,” says Prospero in The Tempest. While he was reflecting on humanity’s fleeting existence, he could just as well have been describing the tech industry’s fascination with shiny new objects that come and go.
Over the decades, countless software leaders have fallen into the age-old trap of chasing “the next big thing,” only to find themselves trying to fit a round peg into a square hole. “There are plenty of examples of this, where we jumped the gun and paid the price,” says Brian Fox, CTO of Sonatype, a provider of software supply chain security software.
A 2025 HostingAdvice.com survey found that most programming language migrations are driven by hype rather than proven outcomes. And an MIT report recently noted that although 80% of enterprises have attempted generative AI pilots, only 5% of those pilots succeeded. The vast majority of those projects stalled or failed to deliver benefits in production.
“As outlined in Amara’s Law, humans tend to ‘overestimate the impact of technology in the short term and underestimate the effect in the long run,’” says Derek Holt, CEO of Digital.ai, an AI-powered software delivery platform.
When the market is saturated with lofty excitement and surging VC interest, it’s human nature to fall for the headlines and lose a bit of sanity to FOMO in the process. But what happens next is usually shoved under the rug: embarrassing overinvestment, abandoned projects, unmet promises, and, ultimately, unfulfilled dreams. In the aftermath of a hollow hype wave, we’re often left with a handful of legitimate use cases, wavering support, and sometimes outright scams.
In this post, we’ll do some post-mortems on the biggest recent software trends that failed to deliver on their promises. Beyond indulging nostalgia, we’ll assess the impacts and pick apart the lessons learned from each cycle. Both skeptics and believers should keep these lessons in mind, especially in times like these, when optimism outpaces reality.
Blockchain
“Blockchain is a textbook case of overhype,” says Kyle Campos, chief technology and product officer at CloudBolt, the provider of a hybrid cloud management platform.
The immutable distributed ledger was supposed to usher in web 3.0 and transform countless industries. Although blockchain still powers cryptocurrency and decentralized finance via Bitcoin or Ethereum smart contracts, mass adoption of private blockchains by enterprises never materialized.
“I saw the insurance industry pour resources into blockchain,” says Campos. “But after substantial investments, most efforts were abandoned because the cost and complexity far outweighed the benefits.”
Others watched blockchain get replaced by simpler tech that just worked. “One supply chain project I saw was shelved after a year and replaced with a simple setup: [Apache] Kafka, signed records, and [Amazon] S3 immutability,” says Srikara Rao, CTO of cloud and cyber security services at R Systems, a global digital solutions provider. “It lacked the blockchain buzzword, but worked reliably and scaled.”
Blockchain simply didn’t fit many use cases. “Blockchain is fundamentally a very, very slow, expensive database,” says Liz Fong-Jones, field CTO at Honeycomb, an observability platform provider. “There are heaps of faster, cheaper databases out there, and the only reason to use blockchain over those is if you really require zero trust in a central party.”
What’s worse, beyond not delivering on ROI, the blockchain industry became a backwater for web 3.0 fraud. In 2024 alone, the FBI reported Americans suffered $9.3 billion in losses due to cryptocurrency-related scams.
In the end, “blockchain for everything” collapsed under high friction, low reward, and few real-world use cases.
Lesson learned: Watch out for technologies that offer solutions to problems that don’t exist.
Metaverse
Have you heard of the metaverse? It’s “the next digital revolution” that businesses are racing toward and executives are calling “breakthrough” and “transformational.” It will redefine everything you do, from social interactions to factory floors to business meetings.
At least, that’s what consultancy Accenture, Meta’s Mark Zuckerberg, Microsoft’s Satya Nadella, and plenty of others had us believing at the height of the pandemic’s surreal, chronically online atmosphere.
Given those futuristic claims, you would think we’d all be meeting around holographic conference tables, talking to employee avatars by now. But no fully immersive, transhumanistic workplace reality has arrived.
“Both blockchain and VR and ‘the metaverse’ were heavily hyped and have failed to achieve success commensurate with the amount of money and hype poured into them,” says Honeycomb’s Fong-Jones.
While AR/VR thrived in niche communities, gaming, and certain training scenarios, the idea of mixed reality taking over work life was wildly overstated.
The absence of a “killer” app, zero desire for VR meetings, and high headset costs stalled the metaverse’s momentum from the get-go. Not to mention the conveniently timed rebrand of Meta, the metaverse’s loudest proponent.
Lesson learned: Don’t buy into paradigm shifts with low user enthusiasm and unproven ROI.
Big data
“Big data was one of the most hyped trends of the last decade,” says Shannon Mason, chief strategy officer, Tempo Software, a project and resource management platform. “It promised magic but delivered mess.”
Back in 2011, consultants McKinsey & Company hailed big data as “the next frontier for innovation.” The idea was that by storing all the data a company could get its hands on, you could unearth valuable insights and inform predictive analytics to direct decision-making.
In practice, the reality was far messier. Teams encountered massive storage and data management overheads, and were left unsure how to turn swelling data lakes into something useful.
“Too often, the reality was sprawling expensive data lakes that became data swamps,” says Mason. “Instead of simplifying decisions, they created new layers of complexity: multiple tools, governance headaches, and very little actionable output.”
When working at CA Technologies and with other teams, Mason witnessed enterprises investing heavily in big data programs only to find them underused and difficult to operationalize. “They stood up enormous infrastructures, often taking months to implement, and then struggled to define how the data would actually inform business outcomes,” she says.
While the promise of big data may never have fully materialized, it arguably influenced some enterprises to take their data strategy more seriously. And, hopes are that AI could one day make mining trends in large data stores more feasible. For some, it already has.
Lesson learned: If a large-scale tech initiative can’t show how it drives business value from day one, it’s probably more burden than breakthrough.
SOA
“After years of hype, SOA never really materialized,” says Digital.ai’s Holt.
Service-oriented architecture (SOA) was an idea trumpeted in the early 2000s as a move from monolithic architecture to component-based, loosely-coupled, reusable services—often coordinated through an integration or management layer.
The idea was to improve reusability, interoperability, and scalability across internal systems, ultimately enhancing business agility and time to market. But like many things that sound too good to be true, it was.
As for why SOA faltered, Holt points to heavyweight standards, orchestration and performance issues, lack of reuse, cultural and organizational hurdles, unclear ownership, and governance concerns. Too often, the people and process elements of the transformation came too late.
However, there’s a silver lining here—SOA paved the way for modern cloud-based designs. “The SOA trend did however give way to microservices and API-first architecture, which are still used today,” says Holt.
To his point, REST APIs are ubiquitous. The API economy is now a multibillion-dollar industry projected to expand further with agentic consumption and API-based productization. Nearly 90% of developers use APIs, according to SlashData, with 61% of API usage being for internal services, like microservices, as reported by Postman in 2023.
Lesson learned: Sometimes, it’s what the trend inspires that leaves the everlasting impact.
NFTs
Non-fungible tokens (NFTs)… where to begin? The fact that enough people were convinced a digital image of a bored ape was worth millions should make anyone do a double-take.
As CloudBolt’s Campos puts it, “NFTs took the hype even further, touted as the future of digital ownership but ultimately collapsing without meaningful use cases.”
To their credit, NFTs were a novel idea for digital artists and collectors: a blockchain-based asset to verify ownership and authenticity of a digital work. NFTs were extolled as the next big investment class and even a transformative technology for business. But by 2023, most NFTs had become virtually worthless, The Guardian reported.
Some defenders still tout niche NFT use cases. One web 3.0 proponent, in a sponsored Forbes post in early 2024, pointed to fringe uses outside of art, like an airline that offers NFT versions of its tickets. Necessary? Useful? Debatable.
Beyond such experiments, NFTs failed to demonstrate lasting value in IT. Public perception plummeted as the bubble burst, copycats proliferated, and major crypto exchanges collapsed.
Lesson learned: Technology based 100% on public perception can disappear as quickly as the hype that created it.
Generative AI
“Generative AI is the latest example,” says Mason, who cites the recent MIT study showing 95% of generative AI pilots fail as very telling.
Similarly, a 2025 McKinsey survey found that 80% of companies using generative AI found no significant bottom-line impact, with 90% of projects still stuck in “pilot mode.”
While the numbers don’t sound promising, the AI hype cycle is more nuanced than others. “The problem isn’t the tech, it’s the approach: broad, abstract use cases instead of targeted pain points,” Mason adds. “The future belongs to smaller, focused AI applications that reduce complexity and solve real problems.”
On the consumer side, the “force-feeding of AI on an unwilling public,” as Ted Gioia puts it, has led to increased apathy: only 8% of Americans would pay extra for AI, reports ZDNET. Generative AI features continue to appear in end-user applications, whether they’re helpful or not—and users are pushing back. The Wall Street Journal reports that companies are learning to be far more cautious about promoting AI in products.
Others agree that AI could use a dose of realism. “Lessons from blockchain can definitely be applied to today’s AI frenzy,” says Campos. “Focus on solving real problems, not chasing buzzwords.”
Even so, AI has more staying power than earlier waves. “AI is different because it actually delivers tangibly different results, at a convenience and price point that is much less of an issue,” says Fong-Jones. Although broader business benefits remain elusive, generative AI has been successfully applied in niches such as software development. It’s undoubtedly here to stay.
Holt also sees many parallels from historical hype cycles to today’s focus on AI and agents, underscoring the need for evolving standards, like Model Context Protocol and Agent2Agent. “Much work is still ahead to continue to improve those standards and to explore more complex use cases,” he says.
Lesson learned: Some hyped technologies are praiseworthy, but need maturity and refinement in where exactly to apply them.
The bigger picture
Of course, these six trends aren’t the only hype waves we’ve lived through. Tech is full of other high promises and low failures. “These hype cycles have been around for years,” Sonatype’s Fox reminds us. “They’re a constant reminder to stay practical and pragmatic about new technologies without abandoning reasoning.”
It’s hard to know when you’re getting swept up in the bandwagon of tech trends, let alone where the road is heading. Sometimes, the confusion can fog up what works in the current moment.
“The industry is often quick to downplay technology trends of the past as new approaches emerge,” says Holt. “While AI and agents are getting nearly all of the hype today, I have little doubt the many innovations over the past few decades will continue to drive impact at scale.”
Regardless, history repeats itself, and hindsight can help guide future tech choices.
For instance, many of the trends above required a high degree of friction and complexity compared to other “mainstream” technologies of the time, making their end payoffs unclear. “Adding exotic technology without a clear, measurable benefit will only cause more pain than payoff,” says R Systems’ Rao.
For Rao, his organization’s dalliance with blockchain proved that people need incentives and accountability to embrace new technology. It also inspired the company to institute kill switches for new experiments. “Now, if we don’t see real usage by a set date, we pivot or stop.”
He goes on to note that even some mainstream tech that appears to be “the status quo” is overhyped. “Survivorship bias ensures that only the few success stories are covered,” he says.
Chasing the next big thing
This isn’t to say that all the ideas lampooned above are worthless. Many sparked innovation and will continue to evolve in their own ways. Still, the gulf between promise and reality, and the tendency for hype to overheat the market, is very apparent in retrospect.
So, what’s driving tech’s insatiable lust for the next big thing? Human psychology. VC dollars. FOMO. Plain curiosity. Excitement and hype, after all, are what drive invention.
As Holt acknowledges: “Without these motivations, many breakthroughs may have never received the resources, attention, and early adoption required to break through.”
He continues: “From railroads and electricity to the internet and AI, the hype around ‘game-changing technology’ drives us forward.”
So, some hype around ‘the next big thing’ ain’t all that bad. It’s knowing how to tell when wishful thinking replaces sanity that makes all the difference.
Or, as Mason says, “Novelty is not value.”
Back to the future: The most popular JavaScript stories and themes of 2025 2 Jan 2026, 1:00 am
Artificial intelligence and its promise to revolutionize programming—and possibly overthrow human sovereignty—is a central story of the post-Covid world. But for JavaScript developers, it is only one of the forces contending for center stage.
Among the notable trends in 2025 was the server side’s growing power, importance, and stability. This was partially driven by the near-universal expansion of front-end frameworks into full-stack ones. It also reflected the maturation of various general-purpose and use-case-specific server-side tools.
Alongside this was a move toward simplicity in JavaScript programming, perhaps best typified by tools like HTMX and Hotwire. The elaborate complex of solutions driven by web requirements (UX, performance, state management, etc.) met an articulate counterpoint in new tools that maximize developer experience.
The big question about AI is how it will redefine the applications we build, and not just how we build them. Currently, AI still seems to be just another feature, rather than a radical upsetting of the tech industry apple cart.
Rain or snow, disruption or iteration, JavaScript developers have one of the best seats on the roller coaster ride of software innovation. Here’s a look at some of the most important moments that defined our last year.
7 stories that defined JavaScript development in 2025
If 2025 was a journey, these were the landmarks. Our most popular JavaScript features and tutorials this year tracked with the three major shifts we felt on the ground: Thirst for simplicity, solidification of the server, and the accelerating adoption of AI.
HTMX and Alpine.js: How to combine two great, lean front ends
JavaScript developers started the year looking for a way to simplify without sacrificing power. Many found a solution in combining HTMX with Alpine.js.
ECMAScript 2025: The best new features in JavaScript
The annual JavaScript language update focused on performance and precision. From the lazy evaluation of the new Iterator object to the AI-ready Float16Array, ECMAScript 2025 showed JavaScript is still a living language, evolving to meet modern demands.
10 JavaScript concepts you need to succeed with Node
As JavaScript’s server side evolves toward stability and power, understanding the runtime becomes non-negotiable. This knowledge isn’t just about syntax but the mechanics of the engine. From the nuances of the event loop to the proper handling of streams and buffers, mastering core JavaScript concepts is the difference between writing Node code and architecting scalable, high-performance systems.
Intro to Nitro: The server engine built for modern JavaScript
Do you ever wonder how modern meta-frameworks run effortlessly on everything from a traditional Node server to a Cloudflare Edge Worker? The answer is Nitro. More than just an HTTP server, Nitro is a universal deployment engine. By abstracting away the runtime, Nitro delivers a massive leap in server-side portability, effectively coalescing the fractured landscape of deployment targets into a single, unified interface.
Intro to Nest.js: Server-side JavaScript development on Node
While Nitro handles the runtime, Nest handles the architecture. Nest has emerged as the gold standard for serious, scalable back-end engineering. By moving beyond the “assemble it yourself” mode of Express middleware, and toward a structured development platform, Nest empowers teams to build large-scale apps in JavaScript.
Comparing Angular, React, Vue, and Svelte: What you need to know
The so-called framework wars have gradually evolved to something else. “Framework collaboration” is far less sensational but rings true over the past few years. All the major frameworks (and many less prominent ones) have attained feature parity, mainly by openly influencing and inspiring each other. Choosing a framework is still a meaningful decision, but the difficulty now is in making the best choice among good ones.
Just say no to JavaScript
Lest you think I am a complete JavaScript fanboy, I offer this popular critique by fellow InfoWorld columnist Nick Hodges. Here, he takes aim at JavaScript and sings the praises of TypeScript, while speculating as to why more developers have not yet taken the leap.
Enterprise Spotlight: Setting the 2026 IT agenda 1 Jan 2026, 2:00 am
IT leaders are setting their operations strategies for 2026 with an eye toward agility, flexibility, and tangible business results.
Download the January 2026 issue of the Enterprise Spotlight from the editors of CIO, Computerworld, CSO, InfoWorld, and Network World and learn about the trends and technologies that will drive the IT agenda in the year ahead.
What’s next for Azure containers? 1 Jan 2026, 1:00 am
The second part of Azure CTO Mark Russinovich’s “Azure Innovations” Ignite 2025 presentation covered software and a deeper look at the platforms he expects developers will use to build cloud-native applications.
Azure was born as a platform-as-a-service (PaaS) environment, providing the plumbing for your applications so you didn’t have to think about infrastructure; it was all automated, hidden behind APIs, and configured through a web portal. Over the years things have evolved, and Azure now supports virtual infrastructures, a command line for managing applications, and its own infrastructure-as-code (IaC) development tools and language.
Despite all this, the vision of a serverless Azure has been a key driver for many of its innovations, from the Functions on-demand compute platform, to the massive data environment of Fabric, to the hosted, scalable orchestration platform that underpins Azure Container Instances for microservices. This vision is key to many of the new tools and services Russinovich talked about, delivering a platform that allows developers to concentrate on code.
That approach doesn’t stop Microsoft from working on new hardware and infrastructure features for Azure; they remain essential for many workloads and are key to supporting the new cloud-native models. It’s important to understand what underlies the abstractions we’re using, as it defines the limits of what we can do with code.
Serverless containers at scale
One of the key serverless technologies in Azure is Azure Container Instances. ACI is perhaps best thought of as a way to get many of the benefits of Kubernetes without having to manage and run a Kubernetes environment. It hosts and manages containers for you, handling scaling and container life cycles. In his infrastructure presentation, Russinovich talked about how new direct virtualization tools made it possible to give ACI-hosted containers access to Azure hardware such as GPUs.
Microsoft is making a big bet on ACI, using it to host many elements of key services across Azure and Microsoft 365. These include Excel’s Python support, the Copilot Actions agents, and Azure’s deployment and automation services, with many more under development or in the middle of migrating to the platform. Russinovich describes it as “the plan of record for Microsoft’s internal infrastructure.”
ACI development isn’t only happening at the infrastructure level. It’s also happening in the orchestration services that manage containers. One key new feature is a tool called NGroups, which lets you define fleets of a standard container image that can be scaled up and burst out as needed. This model supports the service’s rapid scaling standby pools which can be deployed in seconds, applying customization as needed.
With ACI needing to support multitenant operations, there’s a requirement for fair managed resource sharing between containers. Otherwise it would be easy for a hostile container to quickly take all the resources on a server. However, there’s still a need for containers within a subscription to be able to share resources as necessary, a model that Russinovich calls “resource oversubscription.”
This is related to a new feature that builds on the direct virtualization capabilities being added to Azure: stretchable instances. Here you can define the minimum and maximum for CPU and memory and adjust as load changes. Where traditionally containers have scaled out, stretchable instances can also scale up and down within the available headroom on a server.
Improving container networking with managed Cilium
Container networking, another area I’ve touched on in the past, is getting upgrades, with improvements to Azure’s support for eBPF and specifically for the Cilium network observability and security tools. Extended Berkeley Packet Filters let you put probes and rules down into the kernel securely without affecting operations, both in Linux and Windows. It’s a powerful way of managing networking in Kubernetes, where Cilium has become an important component of its security stack.
Until now, even though Azure has had deep eBPF support, you’ve had to bring your own eBPF tools and manage them yourself, which does require expertise to run at scale. Not everyone is a Kubernetes platform engineer, and with tools like AKS providing a managed environment for cloud-native applications, having a managed eBPF environment is an important upgrade. The new Azure Managed Cilium tool provides a quick way of getting that benefit in your applications, using it for host routing and significantly reducing the overhead that comes with iptables-based networking.
You’ll see the biggest improvements in pod-to-pod routing with small message sizes. This shouldn’t be a surprise: the smaller the message, the bigger the routing overhead using iptables. Understanding how this can affect your applications can help you design better messaging, and where small messages get delivered three times faster, it’s worth optimizing applications to take advantage of these performance boosts.
By integrating Cilium with AKS, Microsoft makes it the default way to manage container networking on a pod host (38% faster than a bring-your-own install), working as part of the familiar Advanced Container Networking Services tools. On top of that, Microsoft will ensure your Cilium instance is up to date and will provide support that a bring-your-own instance won’t get.
Even though you are unlikely to interact directly with Azure’s hardware, many of the platform innovations Russinovich talks about depend on the infrastructure changes he discussed in a previous Ignite session, especially on things like the network accelerator in Azure Boost.
This underpins upgrades to Azure Container Storage, working with both local NVMe storage and remote storage using Azure’s storage services. One upgrade here is a distributed cache that allows a Kubernetes cluster to share data using local storage rather than download it to every pod every time you need to use it—a problem that’s increasingly an issue for applications that spin up new pods and nodes to handle inferencing. Using the cache, a download that might take minutes is now a local file access that takes seconds.
Securing containers at an OS level
It’s important to remember that Azure (and other hyperscalers) isn’t in the business of giving users their own servers; its model uses virtual machines and multiple tenants to get as much use out of its hardware as possible. That approach demands a deep focus on security, hardening images and using isolation to separate virtual infrastructures. In the serverless container world, especially with the new direct virtualization features, we need to lock down even more than in a VM, as our ACI-hosted containers are now sharing the same host OS.
Declarative policies let Azure lock down container features to reduce the risk of compromised container images affecting other users. At the same time, it’s working to secure the underlying host OS, which for ACI is Linux. SELinux allows Microsoft to lock that image down, providing an immutable host OS. However, those SELinux policies don’t cross the boundary into containers, leaving their userspace vulnerable.
Microsoft has been adding new capabilities to Linux that can verify the code running in a container. This new feature, Integrity Policy Enforcement, is now part of what Microsoft calls OS Guard, along with another new feature: dm-verity. Device-mapper-verity is a way to provide a distributed hash of the containers in a registry and the layers that go into composing a container, from the OS image all the way up to your binaries. This allows you to sign all the components of a container and use OS Guard to block containers that aren’t signed and trusted.
Delivering secure hot patches
Having a policy-driven approach to security helps quickly remediate issues. If, say, a common container layer has a vulnerability, you can build and verify a patch layer and deploy it quickly. There’s no need to patch everything in the container, only the relevant components. Microsoft has been doing this for OS features for some time now as part of its internal Project Copacetic, and it’s extending the process to common runtimes and libraries, building patches with updated packages for tools like Python.
As this approach is open source, Microsoft is working to upstream dm-verity into the Linux kernel. You can think of it as a way to deploy hot fixes to containers between building new immutable images, quickly replacing problematic code and keeping your applications running while you build, test, and verify your next release. Russinovich describes it as rolling out “a hot fix in a few hours instead of days.”
Providing the tools needed to secure application delivery is only part of Microsoft’s move to defining containers as the standard package for Azure applications. Providing better ways to scale fleets of containers is another key requirement, as is improved networking. Russinovich’s focus on containers makes sense, as they allow you to wrap all the required components of a service and securely run it at scale.
With new software services building on improvements to Azure’s infrastructure, it’s clear that both sides of the Azure platform are working together to deliver the big picture, one where we write code, package it, and (beyond some basic policy-driven configuration) let Azure do the rest of the work for us. This isn’t something Microsoft will deliver overnight, but it’s a future that’s well on its way—one we need to get ready to use.
Critical vulnerability in IBM API Connect could allow authentication bypass 31 Dec 2025, 5:49 pm
IBM is urging customers to quickly patch a critical vulnerability in its API Connect platform that could allow remote attackers to bypass authentication.
The company describes API Connect as a full lifecycle application programming interface (API) gateway used “to create, test, manage, secure, analyze, and socialize APIs.”
It particularly touts it as a way to “unlock the potential of agentic AI” by providing a central point of control for access to AI services via APIs. The platform also includes API Agent, which automates tasks across the API lifecycle using AI.
A key component is a customizable self-service portal that allows developers to easily onboard themselves, and to discover and consume multiple types of API, including SOAP, REST, events, ASyncAPIs, GraphQL, and others.
The flaw, tracked as CVE-2025-13915, affects IBM API Connect versions 10.0.8.0 through 10.0.8.5, and version 10.0.11.0, and could give unauthorized access to the exposed applications, with no user interaction required.
An architectural assumption is broken
“CVE-2025-13915 is not best understood as a security bug,” said Sanchit Vir Gogia, chief analyst at Greyhound Research. “It is better understood as a moment where a long standing architectural assumption finally breaks in the open. The assumption is simple and deeply embedded in enterprise design: If traffic passes through the API gateway, identity has been enforced and trust has been established. This vulnerability proves that assumption can fail completely.”
He noted that the classification of the weakness, which maps to CWE-305, is important because it rules out a whole class of what he called comforting explanations. “This is not stolen credentials. It is not role misconfiguration. It is not a permissions mistake,” he said. “The authentication enforcement itself can be circumvented.”
When that happens, he explained, downstream services do not simply face elevated risk; they lose the foundation on which their access decisions were built, because they do not revalidate identity. They were never designed to; they inherit trust.
“Once enforcement fails upstream, inherited trust becomes unearned trust, and the exposure propagates silently,” he said. “This class of vulnerability aligns with automation, broad scanning, and opportunistic probing rather than careful targeting.”
Interim fixes provided
IBM said that the issue was discovered during internal testing, and it has provided interim fixes for each affected version of the software, with individual update details for VMware, OCP/CP4I, and Kubernetes.
The only mitigation suggested for the flaw, according to IBM’s security bulletin, is this: “Customers unable to install the interim fix should disable self-service sign-up on their Developer Portal if enabled, which will help minimize their exposure to this vulnerability.”
The company also notes in its installation instructions for the fixes that the image overrides described in the document must be removed when upgrading to the next release or fixpack.
This, said Gogia, further elevates the risk. “That is not a cosmetic detail,” he noted. “Management planes define configuration truth, lifecycle control, and operational authority across the platform. When remediation touches this layer, the vulnerability sits close to the control core, not at an isolated gateway edge. That raises both blast radius and remediation risk.”
This is because errors in these areas can turn into prolonged exposure or service instability. “[Image overrides] also introduce a governance hazard: Image overrides create shadow state; if they are not explicitly removed later, they persist quietly,” he pointed out. “Over time, they drift out of visibility, ownership, and audit scope. This is how temporary fixes turn into long term risk.”
Most valuable outcome: Learning
He added that the operational challenges involved in remediation are not so much in knowing what has to be done, but in doing it fast enough without breaking the business. And, he said, API governance now needs to include up to date inventories of APIs, their versions, dependencies, and exposure points, as well as monitoring of behavior.
“The most valuable outcome here is not closure,” Gogia observed. “It is learning. Enterprises should ask what would have happened if this flaw had been exploited quietly for weeks. Which services would have trusted the gateway implicitly? Which logs would have shown abnormal behavior? Which teams would have noticed first? Those answers reveal whether trust assumptions are visible or invisible. Organizations that stop at patching will miss a rare opportunity to strengthen resilience before the next control plane failure arrives.”
What is cloud computing? From infrastructure to autonomous, agentic-driven ecosystems 31 Dec 2025, 4:21 am
Cloud computing continues to be the platform of choice for large applications and a driver of innovation in enterprise technology. Gartner forecasts that spending on public cloud services alone will reach $1.42 trillion in current U.S. dollars, driven by AI workloads and enterprise modernization.
Driving this growth are the rise of AI and machine learning on the cloud, adoption of edge computing, the maturation of serverless computing, the emergence of multicloud strategies, improved security and privacy, and more sustainable cloud practices.
What is cloud computing?
While often used broadly, the term cloud computing is defined as an abstraction of compute, storage, and network infrastructure assembled as a platform on which applications and systems are deployed quickly and scaled on the fly.
Most cloud customers consume public cloud computing services over the internet, which are hosted in large, remote data centers maintained by cloud providers. The most common type of cloud computing, SaaS (software as a service), delivers prebuilt applications to the browsers of customers who pay per seat or by usage, exemplified by such popular apps as Salesforce, Google Docs, or Microsoft Teams.
5 top trends in cloud computing
- Agentic cloud ecosystems: The shift from AI as a tool to AI as an autonomous operator within cloud environments.
- Sovereign and localized clouds: Meeting strict national data residency and digital sovereignty laws.
- Specialized AI hardware access: Navigating the GPU capacity crunch through reserved instances and boutique AI clouds.
- Integrated GreenOps: Merging cost optimization with mandatory carbon-footprint reporting.
- Industry-specific walled gardens: The maturation of vertical clouds into highly regulated, precompliant environments for finance and healthcare.
Next in line is IaaS (infrastructure as a service), which offers vast, virtualized compute, storage, and network infrastructure upon which customers build their own applications, often with the aid of providers’ API-accessible services.
When people refer to “the cloud” today, they most often mean the big IaaS providers: AWS (Amazon Web Services), Google Cloud Platform, or Microsoft Azure. All three have become ecosystems of services that go way beyond infrastructure and include developer tools, serverless computing, machine learning services and APIs, data warehouses, and thousands of other services. With both SaaS and IaaS, a key benefit is agility. Customers gain new capabilities almost instantly without the capital investment in hardware or software on-premises — and they can instantly scale the cloud resources they consume up or down as needed.
According to Foundry’s Cloud Computing Study, 2025, enterprises are moving to the cloud to improve security and/or governance, increase scalability, accelerate adoption of artificial intelligence and machine learning and other new technologies, replace on-premises legacy technology, improve employee productivity, and ensure disaster recovery and business continuity.
Hyperscalers now dominate cloud services
The largest cloud service providers are often described as hyperscalers, due to their capability to provide large-scale data centers across the globe. Hyperscalers typically offer a wide range of cloud services, including IaaS, PaaS, SaaS, and more.
As mentioned above, notable hyperscalers include Amazon Web Services (AWS), Google Cloud Platform, and Microsoft Azure. They offer the following capabilities.
- Scalability: Hyperscalers can handle massive workloads and scale resources up or down quickly.
- Cost-effectiveness: Hyperscalers often offer competitive pricing and economies of scale.
- Global reach: Hyperscalers operate data centers around the world, providing low-latency access to customers in different regions.
- Innovation: Hyperscalers are at the forefront of cloud innovation, offering new services and features.
Challenges of working with hyperscalers
- Vendor lock-in: Relying heavily on a single hyperscaler can create vendor lock-in, making it difficult to switch to another provider and exposing you to large egress fees if you do move.
- Complexity: Hyperscalers offer a vast array of services, which can be overwhelming for some customers.
- Security concerns: Because hyperscalers handle sensitive data, security is a major concern.
AI, agents, and the sovereign cloud
The AI-enabled enterprise has moved beyond simple chatbots. The focus has shifted to agentic workflows — autonomous systems that reside in the cloud and possess the authority to execute business processes, manage cloud spend, and self-patch security vulnerabilities without human intervention.
The shift to agentic infrastructure
Cloud providers are no longer just selling compute. They are selling inference-as-a-service. Modern cloud budgets are now dominated by the high cost of specialized GPU clusters (such as Nvidia’s Blackwell architecture). This has led to the rise of boutique AI clouds that compete with hyperscalers by offering bare-metal access to the latest silicon specifically for model training and fine-tuning.
Data sovereignty and private AI
A major shift in late 2025 is the move away from public AI models for sensitive data. Organizations are increasingly using retrieval-augmented generation (RAG) within walled garden environments. This ensures that a company’s proprietary data never leaves their specific cloud instance to train a provider’s base model.
Furthermore, sovereign AI has become a requirement for global operations. Governments now demand that the AI models processing their citizens’ data be hosted on infrastructure that is owned, operated, and governed within their own borders.
The challenges of ghost AI
Just as shadow IT plagued the 2010s, ghost AI — unauthorized AI agents running on corporate cloud accounts — has become a primary security risk. Managing these autonomous entities requires a new layer of AI governance, where the cloud provider automatically audits the intent and permissions of every running agent to prevent runaway costs or data leaks.
Cloud computing definitions
In 2011, NIST posted a PDF that divided cloud computing into three “service models” — SaaS, IaaS, and PaaS (platform as a service) — the latter being a controlled environment within which customers develop and run applications. These three categories have largely stood the test of time, although most PaaS solutions now are made available as services within IaaS ecosystems rather than as dedicated PaaS clouds.
Two evolutionary trends stand out since NIST’s threefold definition. One is the long and growing list of subcategories within SaaS, IaaS, and PaaS, some of which blur the lines between categories. The other is the explosion of API-accessible services available in the cloud, particularly within IaaS ecosystems. The cloud has become a crucible of innovation where many emerging technologies appear first as services, a big attraction for business customers who understand the potential competitive advantages of early adoption.
SaaS (software as a service) definition
This type of cloud computing delivers applications over the internet, typically with a browser-based user interface. Today, most software companies offer their wares via SaaS — if not exclusively, then at least as an option.
The most popular SaaS applications for business are Google Workspace (formerly G Suite) and Microsoft 365 (formerly Office 365). Most enterprise applications, including giant ERP suites from Oracle and SAP, come in both SaaS and on-premises versions. SaaS applications typically offer extensive configuration options as well as development environments that enable customers to code their own modifications and additions. They also enable data integration with on-prem applications.
IaaS (infrastructure as a service) definition
At a basic level, IaaS cloud providers offer virtualized compute, storage, and networking over the internet on a pay-per-use basis. Think of it as a data center maintained by someone else, remotely, but with a software layer that virtualizes all those resources and automates customers’ ability to allocate them with little trouble.
But that’s just the basics. The full array of services offered by the major public IaaS providers is staggering: highly scalable databases, virtual private networks, big data analytics, AI and machine learning services, application platforms, developer tools, devops tools, and so on. Amazon Web Services was the first IaaS provider and remains the leader, followed by Microsoft Azure, Google Cloud Platform, IBM Cloud, and Oracle Cloud.
PaaS (platform as a service) definition
PaaS provides sets of services and workflows that specifically target developers, who can use shared tools, processes, and APIs to accelerate the development, testing, and deployment of applications. Salesforce’s Heroku and Salesforce Platform (formerly Force.com) are popular public cloud PaaS offerings; Cloud Foundry and Red Hat’s OpenShift can be deployed on premises or accessed through the major public clouds. For enterprises, PaaS can ensure that developers have ready access to resources, follow certain processes, and use only a specific array of services, while operators maintain the underlying infrastructure.
FaaS (function as a service) definition
FaaS, the original and most basic version of serverless computing, adds another layer of abstraction to PaaS, so that developers are insulated from everything in the stack below their code. Instead of futzing with virtual servers, containers, and application runtimes, developers upload narrowly functional blocks of code, and set them to be triggered by a certain event (such as a form submission or uploaded file). All of the major clouds offer FaaS on top of IaaS: AWS Lambda, Azure Functions, Google Cloud Functions, and IBM Cloud Functions. A special benefit of FaaS applications is that they consume no IaaS resources until an event occurs, reducing pay-per-use fees.
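To illustrate how small a FaaS unit of deployment can be, here is a sketch of an AWS Lambda-style Python handler triggered by an uploaded file. The event shape follows Lambda’s S3 trigger conventions; the processing itself is a placeholder.

```python
# Sketch of a FaaS handler: the code runs only when the triggering event
# (here, a file uploaded to object storage) occurs; the platform handles
# provisioning, scaling, and teardown. The processing is a placeholder.
def lambda_handler(event, context):
    for record in event.get("Records", []):       # one record per uploaded object
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]
        print(f"processing s3://{bucket}/{key}")  # placeholder for real work
    return {"status": "ok"}
```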
Private cloud definition
A private cloud downsizes the technologies used to run IaaS public clouds into software that can be deployed and operated in a customer’s data center. As with a public cloud, internal customers can provision their own virtual resources to build, test, and run applications, with metering to charge back departments for resource consumption. For administrators, the private cloud amounts to the ultimate in data center automation, minimizing manual provisioning and management.
VMware remains a force in the private cloud software market, but the acquisition by Broadcom has created confusion and raised concerns among some customers about potential changes in pricing, licensing, and support. This could lead some organizations to explore alternative solutions.
OpenStack continues to be a popular open-source choice for building private clouds. It offers a flexible and customizable platform that can be tailored to specific needs. However, OpenStack can be complex to deploy and manage, and it may require significant expertise to maintain.
Kubernetes, a container orchestration platform that has gained significant traction in recent years, is often used in conjunction with other technologies like OpenStack to build cloud-native applications. Red Hat OpenShift is a comprehensive cloud platform based on Kubernetes that provides a managed experience for deploying and managing container-based applications.
Many cloud providers offer their own cloud-native platforms and tools, such as AWS Outposts, Azure Stack, and Google Cloud Anthos.
Common factors to consider when evaluating private cloud platforms include the following:
- Pricing: The initial cost of deployment and ongoing maintenance costs.
- Complexity: The level of technical expertise needed to manage the platform.
- Flexibility: The ability to customize the platform to meet specific needs.
- Vendor lock-in: The degree to which the organization is tied to a particular vendor.
- Security: The security features and capabilities of the platform.
- Scalability: The capability to expand the platform to meet future needs.
Hybrid cloud definition
A hybrid cloud is the integration of a private cloud with a public cloud. At its most developed, the hybrid cloud involves creating parallel environments in which applications can move easily between private and public clouds. In other instances, databases may stay in the customer data center and integrate with public cloud applications — or virtualized data center workloads may be replicated to the cloud during times of peak demand. The types of integrations between private and public clouds vary widely, but they must be extensive to earn a hybrid cloud designation.
Public APIs (application programming interfaces) definition
Just as SaaS delivers applications to users over the internet, public APIs offer developers application functionality that can be accessed programmatically. For example, in building web applications, developers often tap into the Google Maps API to provide driving directions; to integrate with social media, developers may call upon APIs maintained by Twitter, Facebook, or LinkedIn. Twilio has built a successful business delivering telephony and messaging services via public APIs. Ultimately, any business can provision its own public APIs to enable customers to consume data or access application functionality.
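As a sketch of what consuming such an API looks like from the developer's side, the snippet below makes an authenticated HTTP request for driving directions. The endpoint, parameters, and response shape are hypothetical placeholders rather than any specific provider's actual API.

// Sketch of calling a public API; the URL and fields are hypothetical.
async function getDrivingDirections(origin, destination, apiKey) {
  const url = new URL("https://api.example.com/v1/directions");
  url.searchParams.set("origin", origin);
  url.searchParams.set("destination", destination);

  const response = await fetch(url, {
    headers: { Authorization: `Bearer ${apiKey}` },
  });
  if (!response.ok) {
    throw new Error(`API request failed: ${response.status}`);
  }
  return response.json(); // e.g., a list of route steps
}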
iPaaS (integration platform as a service) definition
Data integration is a key issue for any sizeable company, but particularly for those that adopt SaaS at scale. iPaaS providers typically offer prebuilt connectors for sharing data among popular SaaS applications and on-premises enterprise applications, though providers may focus more or less on business-to-business and e-commerce integrations, cloud integrations, or traditional SOA-style integrations. iPaaS offerings in the cloud from such providers as Dell Boomi, Informatica, MuleSoft, and SnapLogic also let users implement data mapping, transformations, and workflows as part of the integration-building process.
IDaaS (identity as a service) definition
The most difficult security issue related to cloud computing is managing user identity and its associated rights and permissions across data centers and public cloud sites. IDaaS providers maintain cloud-based user profiles that authenticate users and enable access to resources or applications based on security policies, user groups, and individual privileges. The ability to integrate with various directory services (Active Directory, LDAP, etc.) and provide single sign-on across business-oriented SaaS applications is essential.
Leaders in IDaaS include Microsoft, IBM, Google, Oracle, Okta, Capgemini, Junio Corporation, OneLogin, and JumpCloud.
Collaboration platforms
Collaboration solutions such as Slack and Microsoft Teams have become vital messaging platforms that enable groups to communicate and work together effectively. Basically, these solutions are relatively simple SaaS applications that support chat-style messaging along with file sharing and audio or video communication. Most offer APIs to facilitate integrations with other systems and enable third-party developers to create and share add-ins that augment functionality.
Vertical clouds
Key providers in such industries as financial services, healthcare, retail, life sciences, and manufacturing provide PaaS clouds to enable customers to build vertical applications that tap into industry-specific, API-accessible services. Vertical clouds can dramatically reduce the time to market for vertical applications and accelerate domain-specific B2B integrations. Most vertical clouds are built with the intent of nurturing partner ecosystems.
Other cloud computing considerations
The most widely accepted definition of cloud computing means that you run your workloads on someone else’s servers, but this is not the same as outsourcing. Virtual cloud resources and even SaaS applications must be configured and maintained by the customer. Consider these factors when planning a cloud initiative.
Cloud computing security considerations
Objections to the public cloud generally begin with cloud security, although the major public clouds have proven themselves much less susceptible to attack than the average enterprise data center.
Of greater concern is the integration of security policy and identity management between customers and public cloud providers. In addition, government regulation may forbid customers from allowing sensitive data off-premises. Other concerns include the risk of outages and the long-term operational costs of public cloud services.
Multicloud management considerations
To enhance their operational efficiency, reduce costs, and improve security, many companies are increasingly turning to multicloud strategies. By distributing workloads across multiple cloud providers, organizations can avoid vendor lock-in, optimize costs, and leverage the best-of-breed services offered by different providers.
This multicloud approach also improves performance and reliability by minimizing downtime and optimizing latency. Additionally, multicloud strategies strengthen security by diversifying the attack surface and facilitating compliance with industry regulations. Finally, by replicating critical workloads across multiple regions and providers, companies can establish robust disaster recovery and business continuity plans, ensuring minimal disruption in the event of catastrophic failures.
The bar to qualify as a multicloud adopter is low: A customer just needs to use more than one public cloud service. However, depending on the number and variety of cloud services involved, managing multiple clouds can become complex from both a cost optimization and a technology perspective.
In some cases, customers subscribe to multiple cloud services simply to avoid dependence on a single provider. A more sophisticated approach is to select public clouds based on the unique services they offer and, in some cases, integrate them. For example, developers might want to use Google’s Vertex AI Studio on Google Cloud Platform to build AI-driven applications, but prefer Jenkins hosted on the CloudBees platform for continuous integration.
To control costs and reduce management overhead, some customers opt for cloud management platforms (CMPs) and/or cloud service brokers (CSBs), which let you manage multiple clouds as if they were one cloud. The problem is that these solutions tend to limit customers to such common-denominator services as storage and compute, ignoring the panoply of services that make each cloud unique.
Edge computing considerations
You often see edge computing incorrectly described as an alternative to cloud computing. Edge computing is about moving compute to local devices in a highly distributed system, typically as a layer around a cloud computing core. There is typically a cloud involved to orchestrate all of the devices and take in their data, then analyze it or otherwise act on it.
To the cloud and back – why repatriation is real
While the public cloud offers scalability and flexibility, some enterprises are opting to return to on-premises infrastructure due to rising costs, data security concerns, performance issues, vendor lock-in, and regulatory compliance challenges. On-premises infrastructure provides greater control, customization, and potential cost savings in certain scenarios, leading some technology decision-makers to consider repatriation. However, a hybrid cloud approach, combining public and private clouds, often offers the best balance of benefits.
More specific reasons to repatriate include the following:
- Unanticipated costs, such as data transfer fees, storage charges, and egress fees, can quickly escalate, especially for large-scale cloud deployments.
- Inaccurate resource provisioning or underutilization can lead to higher-than-expected costs.
- Stricter data privacy regulations require organizations to store and process data within specific geographic boundaries.
- For highly sensitive data, companies may prefer to maintain greater control over security measures and access permissions.
- On-premises infrastructure can offer lower latency, particularly for applications requiring real-time processing or high-performance computing.
- Overreliance on a single cloud provider can limit flexibility and increase costs. Repatriation allows organizations to diversify their infrastructure and reduce vendor dependency.
- Industries with stringent compliance requirements may find it easier to meet standards with on-premises infrastructure.
- On-premises environments offer greater control over hardware, software, and network configurations, allowing for customized solutions.
Benefits of cloud computing
The cloud’s main appeal is to reduce the time to market of applications that need to scale dynamically. Increasingly, however, developers are drawn to the cloud by the abundance of advanced new services that can be incorporated into applications, from machine learning to internet of things (IoT) connectivity.
Although businesses sometimes migrate legacy applications to the cloud to reduce data center resource requirements, the real benefits accrue to new applications that take advantage of cloud services and “cloud native” attributes. The latter include microservices architecture, Linux containers to enhance application portability, and container management solutions such as Kubernetes that orchestrate container-based services. Cloud-native approaches and solutions can be part of either public or private clouds and help enable highly efficient devops workflows.
Cloud computing, be it public or private or hybrid or multicloud, has become the platform of choice for large applications, particularly customer-facing ones that need to change frequently or scale dynamically. More significantly, the major public clouds now lead the way in enterprise technology development, debuting new advances before they appear anywhere else. Workload by workload, enterprises are opting for the cloud, where an endless parade of exciting new technologies invite innovative use.
SaaS has its roots in the ASP (application service provider) trend of the early 2000s, when providers would run applications for business customers in the provider’s data center, with dedicated instances for each customer. The ASP model was a spectacular failure because it quickly became impossible for providers to maintain so many separate instances, particularly as customers demanded customizations and updates.
Salesforce is widely considered the first company to launch a highly successful SaaS application using multitenancy — a defining characteristic of the SaaS model. Rather than each Salesforce customer getting its own application instance, customers who subscribe to the company’s salesforce automation software share a single, large, dynamically scaled instance of an application (like tenants sharing an apartment building), while storing their data in separate, secure repositories on the SaaS provider’s servers. Fixes can be rolled out behind the scenes with zero downtime and customers can receive UX or functionality improvements as they become available.
Intro to Hotwire: HTML over the wire 31 Dec 2025, 1:00 am
If you’ve been watching the JavaScript landscape for a while, you’ve likely noticed the trend toward simplicity in web application development. An aspect of this trend is leveraging HTML, REST, and HATEOAS (hypermedia as the engine of application state) to do as much work as possible. In this article, we’ll look at Hotwire, a collection of tools for building single-page-style applications using HTML over the wire.
Hotwire is a creative take on front-end web development. It’s also quite popular, with more than 33,000 stars on GitHub and 493,000 weekly NPM downloads as of this writing.
Hotwire: An alternative to HTMX
Hotwire is built on similar principles to HTMX and offers an alternative approach to using HTML to drive the web. Both projects strive to eliminate boilerplate JavaScript and let developers do more with simple markup. Both embrace HATEOAS and the original form of REST. The central insight here is that application markup can contain both the state (or data) and the structure of how data is to be displayed. This makes it possible to sidestep the unnecessary logistics of marshaling JSON at both ends.
This concept isn’t new—in fact, it is the essence of representational state transfer (REST). Instead of converting to a special data format (JSON) on the server, then sending that over to the client where it is converted for the UI (HTML), you can just have the server send the HTML.
Technologies like HTMX and Hotwire streamline the process, making it palatable for developers and users who are acclimated to the endless micro-interactions spawned by Ajax.
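To make the contrast concrete, here is a rough sketch of the same data exposed both ways from an Express-style server; the routes and the in-memory data store are hypothetical stand-ins.

// Sketch only: contrasting JSON-over-the-wire with HTML-over-the-wire.
const express = require("express");
const app = express();

// Stand-in for a real data store.
const db = {
  findBerry: async (id) => ({ name: "Thimbleberry", description: "A delicate, native berry." }),
};

// JSON approach: the client must fetch, parse, and template the data itself.
app.get("/api/berries/:id", async (req, res) => {
  const berry = await db.findBerry(req.params.id);
  res.json({ name: berry.name, description: berry.description });
});

// HTML-over-the-wire: the server returns a ready-to-insert fragment instead.
app.get("/berries/:id/card", async (req, res) => {
  const berry = await db.findBerry(req.params.id);
  res.send(`<div class="berry"><h2>${berry.name}</h2><p>${berry.description}</p></div>`);
});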
Hotwire has three primary JavaScript components, but we are mainly interested in the first two:
- Turbo: Allows for fine-grained control of page updates.
- Stimulus: A concise library for client-side interactivity.
- Native: A library for creating iOS- and Android-native apps from Turbo and Stimulus.
In this article, we will look at Turbo and Stimulus. Turbo has several components that make interactivity with HTML more powerful:
- Turbo Drive avoids full page reloads for links and form submits.
- Turbo Frame lets you define areas of the UI that can be loaded independently (including lazy loading).
- Turbo Streams allows for arbitrary updates to specific page segments (using WebSockets, server-sent events, or a form response).
Turbo Drive: Merging pages, not loading pages
In standard HTML, when you load a page, the browser throws away the existing content and paints everything anew as it arrives from the server. This is inefficient and makes for a bad user experience. Turbo Drive takes a different approach: you drop in a JavaScript include, and it merges the incoming page contents instead of reloading them.
Think of merging as diffing the current page against the incoming page. The header information is updated rather than wholesale replaced. Modern Turbo even "morphs" the <head> and <body> elements, providing a much smoother transition. (For obvious reasons, this approach is especially effective for page reloads.)
All you have to do is include the turbo script in your page:
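In an npm-based setup, that can be as simple as the following; importing the published @hotwired/turbo package activates Turbo Drive for the whole page (a CDN script tag works as well).

// app.js -- importing Turbo enables Drive's link and form interception globally.
import * as Turbo from "@hotwired/turbo";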
It is also important to point out that browser actions like back, forward, and reload all work normally. Merging is a low-cost, low-risk way of improving page navigation and reloads in web pages.
Turbo Frames: Granular UI development
The basic idea in Frames is to decompose the layout of a web page into <turbo-frame> elements. You then update these frames piecemeal, and only as needed. The overall effect is like using JSON responses to drive reactive updates to the UI, but in this case we are using HTML fragments.
Take this page as an example:
<nav>
  <a href="...">Links that change the entire page</a>
</nav>

<turbo-frame id="description">
  <h2>Thimbleberry (Rubus parviflorus)</h2>
  <p>A delicate, native berry with large, soft leaves.</p>
  <a href="...">Edit this description</a>
</turbo-frame>

<turbo-frame id="field_notes">
  <div id="notes_list">
    <div id="note_1">Found a large patch by the creek.</div>
    <div id="note_2">The berries are very fragile.</div>
  </div>
  ...
</turbo-frame>
Here we have a top navigation pane with links that will affect the entire page (usable with Turbo Drive). Then there are two interior <turbo-frame> elements that can be modified in place, without a full page reload.
The <turbo-frame> elements capture navigation events within them. So, when you click the link to edit the field notes, the server can respond with a <turbo-frame> chunk that provides an editable form:
<turbo-frame id="field_notes">
  <h2>Field Notes</h2>
  <div id="notes_list">
    <div id="note_1">Found a large patch by the creek.</div>
    <div id="note_2">The berries are very fragile.</div>
  </div>
  <form action="/berries/thimbleberry/notes" method="post">
    <textarea name="content"></textarea>
    <input type="submit" value="Save note">
  </form>
</turbo-frame>
This chunk would be rendered as a live form. The user can make updates and submit the new data, and the server would reply with a new fragment containing the updated frame:
<turbo-frame id="field_notes">
  <h2>Field Notes</h2>
  <div id="notes_list">
    <div id="note_1">Found a large patch by the creek.</div>
    <div id="note_2">The berries are very fragile.</div>
    <div id="note_3">Just saw a bear!</div>
  </div>
  <form action="/berries/thimbleberry/notes" method="post">
    <textarea name="content"></textarea>
    <input type="submit" value="Save note">
  </form>
</turbo-frame>
Turbo takes the ID on the arriving frame content and ensures it replaces the same frame on the page (so it is essential that the server puts the correct ID on the fragments it sends). Turbo is smart enough to extract and place only the relevant fragment, even if an entire page is received from the server.
Turbo Streams: Compound updates
Turbo Drive is a simple and effective mechanism for handling basic server interactions. Sometimes, we need more powerful updates that interact with multiple portions of the page, or that are triggered from the server side. For that, Turbo has Streams.
The basic idea is that the server sends a stream of fragments, each with the ID of the part of the UI that will change, along with the content needed for the change. For example, we might have a stream of updates for our wilderness log:
<turbo-stream action="append" target="notes_list">
  <template>
    <div id="note_4">Just saw a Fox!</div>
  </template>
</turbo-stream>

<turbo-stream action="replace" target="notes_count">
  <template>
    <span id="notes_count">4 Notes</span>
  </template>
</turbo-stream>
Here, we are using streams instead of frames to handle the notes update. The idea is that each section that needs updating (the new note, the note counter, the live form section) receives its content as a stream item. Notice that each stream item has an "action" and a "target" to describe what will happen.
Streams can also target multiple elements at once by using the targets attribute (notice the plural) with a CSS selector that identifies the elements to be affected.
Turbo will automatically handle responses from the server (such as a form response) that contain a collection of <turbo-stream> elements, placing them correctly into the UI. This covers many multi-change requirements. Notice also that when you are using streams, you don't need a <turbo-frame>. In fact, mixing the two is not recommended. As a rule of thumb, use frames for simplicity whenever you can, and upgrade to streams (and dispense with frames) only when you need to.
Reusability
A key benefit to both Turbo Frames and Turbo Streams is being able to reuse the server-side templates that render UI elements both initially and for updates. You simply decompose your server-side template (like RoR templates or Thymeleaf or Kotlin DSL or Pug—whatever tool you are using) into the same chunks the UI needs. Then you can just use them to render both the initial and ongoing states of those chunks.
For example, here’s a simple Pug template that could be used as part of the whole page or to generate update chunks:
turbo-frame#field_notes
  h2 Field Notes

  //- 1. The List: Iterates over the 'notes' array
  div#notes_list
    each note in notes
      div(id=`note_${note.id}`)= note.content

  //- 2. The Form: On submission, this fragment is re-rendered
  //-    by the server, which includes a fresh, empty form.
  form(action="/berries/thimbleberry/notes", method="post")
    div
      label(for="note_content") Add a new note:
    div
      //- We just need the 'name' attribute for the server
      textarea(id="note_content", name="content")
    div
      input(type="submit", value="Save note")
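To illustrate the reuse, here is a sketch of Express-style routes that render this same template for the initial page and again after a note is saved; the route paths, view names, and the in-memory notesStore are assumptions for illustration.

// routes/notes.js -- a sketch; assumes Express with Pug as the view engine.
const express = require("express");
const router = express.Router();

// Stand-in for a real data store.
const notesStore = {
  notes: [],
  all: async function () { return this.notes; },
  add: async function (content) { this.notes.push({ id: this.notes.length + 1, content }); },
};

// Initial page load: the full page template includes the field_notes fragment.
router.get("/berries/thimbleberry", async (req, res) => {
  res.render("berry_page", { notes: await notesStore.all() });
});

// Form submission: re-render only the fragment; Turbo swaps it into the frame.
router.post("/berries/thimbleberry/notes", express.urlencoded({ extended: false }),
  async (req, res) => {
    await notesStore.add(req.body.content);
    res.render("field_notes", { notes: await notesStore.all() });
  });

module.exports = router;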
Server push
It’s also possible to provide background streams of events using the <turbo-stream-source> element:

<turbo-stream-source src="..."></turbo-stream-source>

This element automatically connects to a back-end API for SSE or WebSocket updates. The broadcast updates have the same structure as before:

<turbo-stream action="append" target="notes_list">
  <template>
    <div>Also found Salmonberries here!</div>
  </template>
</turbo-stream>
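On the server side, one way to feed that element is a plain server-sent events endpoint that writes Turbo Stream fragments as messages. This Express-style sketch, including the route path and the timed trigger, is an assumption for illustration only.

// Sketch of an SSE endpoint that pushes Turbo Stream fragments to the browser.
const express = require("express");
const app = express();

app.get("/berries/thimbleberry/updates", (req, res) => {
  res.set({
    "Content-Type": "text/event-stream",
    "Cache-Control": "no-cache",
    Connection: "keep-alive",
  });

  // In a real app this would be driven by application events;
  // here we push a single hypothetical update after a short delay.
  setTimeout(() => {
    const fragment =
      '<turbo-stream action="append" target="notes_list">' +
      "<template><div>Also found Salmonberries here!</div></template>" +
      "</turbo-stream>";
    // SSE messages are "data:" lines terminated by a blank line.
    res.write(`data: ${fragment}\n\n`);
  }, 1000);
});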
Client-side magic with Stimulus
HTMX is sometimes paired with Alpine.js, with the latter giving you fancier front-end interactivity like accordions, drag-and-drop functionality, and so forth. In Hotwire, Stimulus serves the same purpose.
In Stimulus, you use HTML attributes to connect elements to “controllers,” which are chunks of JavaScript functionality. For example, if we wanted to provide a clipboard copy button, we could do something like this:
<div data-controller="clipboard">
  <h2 data-clipboard-target="source">Thimbleberry (Rubus parviflorus)</h2>
  <p>A delicate, native berry with large, soft leaves.</p>
  <button data-action="click->clipboard#copy" data-clipboard-target="feedback">Copy Name</button>
</div>
Notice the data-controller attribute. That links the element to the clipboard controller. Stimulus uses a filename convention, and in this case the file would be clipboard_controller.js, with contents something like this:
import { Controller } from "@hotwired/stimulus"

export default class extends Controller {
  // Connects to data-clipboard-target="source"
  // and data-clipboard-target="feedback"
  static targets = [ "source", "feedback" ]

  // Runs when data-action="click->clipboard#copy" is triggered
  copy() {
    // 1. Get text from the "source" target
    const textToCopy = this.sourceTarget.textContent

    // 2. Use the browser's clipboard API
    navigator.clipboard.writeText(textToCopy)

    // 3. Update the "feedback" target to tell the user
    this.feedbackTarget.textContent = "Copied!"

    // 4. (Optional) Reset the button after 2 seconds
    setTimeout(() => {
      this.feedbackTarget.textContent = "Copy Name"
    }, 2000)
  }
}
The static targets member exposes those elements to the controller, based on the data-clipboard-target attributes in the markup. The controller then uses plain JavaScript to perform the clipboard copy and show a timed confirmation message in the UI.
The basic idea is you keep your JavaScript nicely isolated in small controllers that are linked into the markup as needed. This lets you do whatever extra client-side magic to enhance the server-side work in a manageable way.
Conclusion
The beauty of Hotwire is in doing most of what you need with a very small footprint. It does 80% of the work with 20% of the effort. Hotwire doesn’t have the extravagant power of a full-blown framework like React or a full-stack option like Next.js, but it gives you most of what you’ll need for most development scenarios. And Hotwire works with virtually any back end built on typical server-side technologies.
Nvidia licenses Groq’s inferencing chip tech and hires its leaders 30 Dec 2025, 7:22 am
Nvidia has licensed intellectual property from inferencing chip designer Groq, and hired away some of its senior executives, but stopped short of an outright acquisition.
“We’ve taken a non-exclusive license to Groq’s IP and have hired engineering talent from Groq’s team to join us in our mission to provide world-leading accelerated computing technology,” an Nvidia spokesman said Tuesday, via email. But, he said, “We haven’t acquired Groq.”
Groq designs and sells chips optimized for AI inferencing. These chips, which Groq calls language processing units (LPUs), are lower-powered, lower-priced devices than the GPUs Nvidia designs and sells, which these days are primarily used for training AI models. As the AI market matures, and usage shifts from the creation of AI tools to their use, demand for devices optimized for inferencing is likely to grow.
The company also rents out its chips, operating an inferencing-as-a-service business called GroqCloud.
Groq itself announced the deal and the executive moves on Dec. 24, saying “it has entered into a non-exclusive licensing agreement with Nvidia for Groq’s inference technology” and that, as part of the agreement, “Jonathan Ross, Groq’s Founder, Sunny Madra, Groq’s President, and other members of the Groq team will join Nvidia to help advance and scale the licensed technology.”
The deal could be worth as much as $20 billion, TechCrunch reported.
A way out of the memory squeeze?
There’s tension throughout the supply chain for chips used in AI applications; on Nvidia’s most recent earnings call, its CFO reported that some of its chips are “sold out” or “fully utilized.” One contributing factor identified by analysts is a shortage of high-bandwidth memory. Finding ways to make their AI operations less dependent on scarce memory chips is becoming a key objective for AI vendors and enterprise buyers alike.
A significant difference between Groq’s chip designs and Nvidia’s is the type of memory each uses. Nvidia’s fastest chips are designed to work with high-bandwidth memory, the price of which – like that of other fast memory technologies — is soaring due to limited production capacity and rising demand in AI-related applications. Groq, meanwhile, integrates static RAM into its chip designs. It says SRAM is faster and less power-hungry than the dynamic RAM used by competing chip technologies — and another advantage is that it’s not (yet) as scarce as the high-bandwidth memory or DDR5 DRAM used elsewhere. Licensing Groq’s technology opens the way for Nvidia to diversify its memory sourcing.
Not an acquisition
By structuring its relationship with Groq as an IP licensing deal, and hiring the engineers it is most interested in rather than buying their employer, Nvidia avoids taking on the GroqCloud service business just as it is reportedly stepping back from its own service business, DGX cloud, and restructuring it as an internal engineering service. It could also escape much of the antitrust scrutiny that would have accompanied a full-on acquisition.
Nvidia did not respond to questions about the names and roles of the former Groq executives it has hired.
However, Groq’s founder, Jonathan Ross, reports on his LinkedIn profile that he is now chief software architect at Nvidia, while that of Groq’s former president, Sunny Madra, says he is now Nvidia’s VP of hardware.
What’s left of Groq will be run by Simon Edwards, formerly CFO at sales automation software vendor Conga. He joined Groq as CFO just three months ago.
This article first appeared on Network World.
How to build RAG at scale 30 Dec 2025, 1:00 am
Retrieval-augmented generation (RAG) has quickly become the enterprise default for grounding generative AI in internal knowledge. It promises less hallucination, more accuracy, and a way to unlock value from decades of documents, policies, tickets, and institutional memory. Yet while nearly every enterprise can build a proof of concept, very few can run RAG reliably in production.
This gap has nothing to do with model quality. It is a systems architecture problem. RAG breaks at scale because organizations treat it like a feature of large language models (LLMs) rather than a platform discipline. The real challenges emerge not in prompting or model selection, but in ingestion, retrieval optimization, metadata management, versioning, indexing, evaluation, and long-term governance. Knowledge is messy, constantly changing, and often contradictory. Without architectural rigor, RAG becomes brittle, inconsistent, and expensive.
RAG at scale demands treating knowledge as a living system
Prototype RAG pipelines are deceptively simple: embed documents, store them in a vector database, retrieve top-k results, and pass them to an LLM. This works until the first moment the system encounters real enterprise behavior: new versions of policies, stale documents that remain indexed for months, conflicting data in multiple repositories, and knowledge scattered across wikis, PDFs, spreadsheets, APIs, ticketing systems, and Slack threads.
When organizations scale RAG, ingestion becomes the foundation. Documents must be normalized, cleaned, and chunked with consistent heuristics. They must be version-controlled and assigned metadata that reflects their source, freshness, purpose, and authority. Failure at this layer is the root cause of most hallucinations. Models generate confidently incorrect answers because the retrieval layer returns ambiguous or outdated knowledge.
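As a small sketch of what that discipline looks like in practice, each chunk written to the index can carry explicit provenance, freshness, and versioning metadata; the field names and values below are illustrative rather than a standard schema.

// Sketch of an ingestion record: one chunk plus the metadata that retrieval,
// filtering, and re-indexing later depend on. Field names are illustrative.
const chunkRecord = {
  id: "dispute-policy-v3-chunk-012",              // stable, version-aware identifier
  text: "Disputes must be filed within 60 days of the statement date...",
  embedding: [],                                  // vector produced at ingestion time
  metadata: {
    source: "sharepoint://policies/disputes.pdf", // illustrative source URI
    documentVersion: "3.2",
    effectiveDate: "2025-10-01",
    authority: "official-policy",                 // vs. wiki page, ticket, chat thread
    lastVerified: "2025-12-15",
    chunkingStrategy: "heading-aware, ~512 tokens",
  },
};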
Knowledge, unlike code, does not naturally converge. It drifts, forks, and accumulates inconsistencies. RAG makes this drift visible and forces enterprises to modernize knowledge architecture in a way they’ve ignored for decades.
Retrieval optimization is where RAG succeeds or fails
Most organizations assume that once documents are embedded, retrieval “just works.” In practice, retrieval quality determines RAG quality far more than the LLM does. As vector stores scale to millions of embeddings, similarity search becomes noisy, imprecise, and slow. Many retrieved chunks are thematically similar but semantically irrelevant.
The solution is not more embeddings; it is a better retrieval strategy. Large-scale RAG requires hybrid search that blends semantic vectors with keyword search, BM25, metadata filtering, graph traversal, and domain-specific rules. Enterprises also need multi-tier architectures that use caches for common queries, mid-tier vector search for semantic grounding, and cold storage or legacy data sets for long-tail knowledge.
The retrieval layer must behave more like a search engine than a vector lookup. It should choose retrieval methods dynamically, based on the nature of the question, the user’s role, the sensitivity of the data, and the context required for correctness. This is where enterprises often underestimate the complexity. Retrieval becomes its own engineering sub-discipline, on par with devops and data engineering.
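A minimal sketch of that strategy: apply metadata and role filters, then blend a vector-similarity score with a keyword (BM25-style) score before ranking. The store clients, the policy check, and the 0.6/0.4 weights are all assumptions for illustration.

// Hybrid retrieval sketch: vectorStore, keywordIndex, and allowedForRole
// are hypothetical clients/helpers standing in for real infrastructure.
async function hybridRetrieve(query, role, topK = 8) {
  const [semanticHits, keywordHits] = await Promise.all([
    vectorStore.search(query, { limit: 50 }),   // semantic similarity search
    keywordIndex.search(query, { limit: 50 }),  // BM25-style keyword search
  ]);

  const candidates = new Map();
  const add = (hit, field) => {
    if (!allowedForRole(hit.metadata, role)) return; // sensitivity / role filter
    const entry = candidates.get(hit.id) || { ...hit, semScore: 0, kwScore: 0 };
    entry[field] = Math.max(entry[field], hit.score);
    candidates.set(hit.id, entry);
  };
  semanticHits.forEach((h) => add(h, "semScore"));
  keywordHits.forEach((h) => add(h, "kwScore"));

  // Blend the two signals (weights are illustrative) and keep the top K.
  return [...candidates.values()]
    .map((c) => ({ ...c, blended: 0.6 * c.semScore + 0.4 * c.kwScore }))
    .sort((a, b) => b.blended - a.blended)
    .slice(0, topK);
}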
Reasoning, grounding, and validation protect answers from drift
Even perfect retrieval does not guarantee a correct answer. LLMs may ignore context, blend retrieved content with prior knowledge, interpolate missing details, or generate fluent but incorrect interpretations of policy text. Production RAG requires explicit grounding instructions, standardized prompt templates, and validation layers that inspect generated answers before returning them to users.
Prompts must be version-controlled and tested like software. Answers must include citations with explicit traceability. In compliance-heavy domains, many organizations route answers through a secondary LLM or rule-based engine that verifies factual grounding, detects hallucination patterns, and enforces safety policies.
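One lightweight way to enforce this is to template the prompt around the retrieved chunks and reject answers that fail a citation check before they reach the user; the template wording and helper functions below are a sketch, not a standard.

// Sketch: a grounded prompt template plus a simple citation validator.
function buildGroundedPrompt(question, chunks) {
  const sources = chunks
    .map((c, i) => `[${i + 1}] (${c.metadata.source}, v${c.metadata.documentVersion})\n${c.text}`)
    .join("\n\n");
  return [
    "Answer using ONLY the numbered sources below, and cite them as [n].",
    'If the sources do not contain the answer, reply "I don\'t know."',
    "",
    "Sources:",
    sources,
    "",
    `Question: ${question}`,
  ].join("\n");
}

function hasValidCitations(answer, chunkCount) {
  // Require at least one [n] reference that points at a real source.
  const cited = [...answer.matchAll(/\[(\d+)\]/g)].map((m) => Number(m[1]));
  return cited.some((n) => n >= 1 && n <= chunkCount);
}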
Without a structure for grounding and validation, retrieval is only optional input, not a constraint on model behavior.
A blueprint for enterprise-scale RAG
Enterprises that succeed with RAG rely on a layered architecture. The system works not because any single layer is perfect, but because each layer isolates complexity, makes change manageable, and keeps the system observable.
Below is the reference architecture that has emerged through large-scale deployments across fintech, SaaS, telecom, healthcare, and global retail. It illustrates how ingestion, retrieval, reasoning, and agentic automation fit into a coherent platform.
To understand how these concerns fit together, it helps to visualize RAG not as a pipeline but as a vertically integrated stack, one that moves from raw knowledge to agentic decision-making:

[Diagram: the layered enterprise RAG stack, moving from raw knowledge through ingestion, retrieval, and reasoning up to agentic orchestration. Image credit: Foundry]
This layered model is more than an architectural diagram: it represents a set of responsibilities. Each layer must be observable, governed, and optimized independently. When ingestion improves, retrieval quality improves. When retrieval matures, reasoning becomes more reliable. When reasoning stabilizes, agentic orchestration becomes safe enough to trust with automation.
The mistake most enterprises make is collapsing these layers into a single pipeline. That decision works for demos but fails under real-world demands.
Agentic RAG is the next step toward adaptive AI systems
Once the foundational layers are stable, organizations can introduce agentic capabilities. Agents can reformulate queries, request additional context, validate retrieved content against known constraints, escalate when confidence is low, or call APIs to augment missing information. Instead of retrieving once, they iterate through the steps: sense, retrieve, reason, act, and verify.
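A simplified sketch of that loop is shown below; the llm and validator clients, the confidence threshold, and the retry budget are assumptions rather than a reference implementation (hybridRetrieve and buildGroundedPrompt are the sketches from earlier).

// Agentic retrieval sketch: retrieve, reason, verify, and reformulate until
// confident or out of retries. llm and validator are hypothetical clients.
async function agenticAnswer(question, maxRounds = 3) {
  let query = question;
  for (let round = 0; round < maxRounds; round++) {
    const chunks = await hybridRetrieve(query);                            // sense + retrieve
    const draft = await llm.answer(buildGroundedPrompt(question, chunks)); // reason
    const check = await validator.score(draft, chunks);                    // verify grounding

    if (check.confidence >= 0.8) {
      return { answer: draft, sources: chunks };                           // act: return
    }
    // Act: reformulate the query with what the validator says is missing.
    query = await llm.rewriteQuery(question, check.missingInfo);
  }
  return { escalate: true, reason: "low confidence after retries" };
}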
This is what differentiates RAG demos from AI-native systems. Static retrieval struggles with ambiguity or incomplete information. Agentic RAG systems overcome those limitations because they adapt dynamically.
The shift to agents does not eliminate the need for architecture; it strengthens it. Agents rely on retrieval quality, grounding, and validation. Without these, they amplify errors rather than correct them.
Where RAG fails in the enterprise
Despite strong early enthusiasm, most enterprises confront the same problems. Retrieval latency climbs as indexes grow. Embeddings drift out of sync with source documents. Different teams use different chunking strategies, producing wildly inconsistent results. Storage and LLM token costs balloon. Policies and regulations change, but documents are not re-ingested promptly. And because most organizations lack retrieval observability, failures are hard to diagnose, leading teams to mistrust the system.
These failures all trace back to the absence of a platform mindset. RAG is not something each team implements on its own. It is a shared capability that demands consistency, governance, and clear ownership.
A case study in scalable RAG architecture
A global financial services company attempted to use RAG to support its customer-dispute resolution process. The initial system struggled: retrieval returned outdated versions of policies, latency spiked during peak hours, and agents in the call center received inconsistent answers from the model. Compliance teams raised concerns when the model’s explanations diverged from the authoritative documentation.
The organization re-architected the system using a layered model. They implemented hybrid retrieval strategies that blended semantic and keyword search, introduced strict versioning and metadata policies, standardized chunking across teams, and deployed retrieval observability dashboards that exposed cases where documents contradicted each other. They also added an agent that automatically rewrote unclear user queries and requested additional context when initial retrieval was insufficient.
The results were dramatic. Retrieval precision tripled, hallucination rates dropped sharply, and dispute resolution teams reported significantly higher trust in the system. What changed was not the model but the architecture surrounding it.
Retrieval is the key
RAG is often discussed as a clever technique for grounding LLMs, but in practice it becomes a large-scale architecture project that forces organizations to confront decades of knowledge debt. Retrieval, not generation, is the core constraint. Chunking, metadata, and versioning matter as much as embeddings and prompts. Agentic orchestration is not a futuristic add-on, but the key to handling ambiguous, multi-step queries. And without governance and observability, enterprises cannot trust RAG systems in mission-critical workflows.
Enterprises that treat RAG as a durable platform rather than a prototype will build AI systems that scale with their knowledge, evolve with their business, and provide transparency, reliability, and measurable value. Those who treat RAG as a tool will continue to ship demos, not products.
—
New Tech Forum provides a venue for technology leaders—including vendors and other outside contributors—to explore and discuss emerging enterprise technology in unprecedented depth and breadth. The selection is subjective, based on our pick of the technologies we believe to be important and of greatest interest to InfoWorld readers. InfoWorld does not accept marketing collateral for publication and reserves the right to edit all contributed content. Send all inquiries to doug_dineley@foundryco.com.