Postman API platform adds AI-native, Git-based workflows 3 Mar 2026, 2:02 pm
Looking to accelerate API development via AI, Postman has added AI-native, Git-based API workflows to its Postman API platform. The company also introduced the Postman API Catalog, a central system of record that provides a single view of APIs and services across an organization.
The new Postman platform capabilities were announced March 1. With the new release, Postman’s AI-powered intelligence layer, rather than operating as a standalone assistant, now runs inside the platform, with visibility into specifications, tests, environments, and real production behavior, Postman said. Agent Mode in Postman now works with Git repositories to understand API collections, definitions, and underlying code. This reduces manual steps in workflows such as debugging, writing tests, and syncing code with API collections, according to the company.
Additional new AI-native capabilities in the API platform include:
- Native Git workflows to manage API specs, collections, tests, mocks, and environments directly in developers’ Git repos and local file systems.
- AI-powered coordination with Agent Mode across specs, tests, and mocks to automate multi-step changes with broad workflow context, including input provided by MCP servers from Atlassian, Amazon CloudWatch, GitHub, Linear, Sentry, and Webflow.
- Integrated API distribution to publish documentation, workflows, sandboxes, and SDKs in one place.
The new API Catalog, meanwhile, provides a central system of record for APIs and services, delivering enterprise-wide visibility and governance. The API Catalog provides a real-time view of which APIs and services exist, how they are performing, and who owns them, Postman said.
Cloud architects earn the highest salaries 3 Mar 2026, 1:00 am
I’ve watched cloud careers rise and fall with each new wave of tools, from the early “lift-and-shift everything” days to today’s platform engineering, AI-ready data estates, and security-by-default mandates. Through all of it, the role that stays stubbornly in demand is the cloud architect because the hardest part of cloud has never been spinning up resources. The hard part is making hundreds of decisions that won’t quietly compound into outages, cost blowouts, security gaps, or organizational gridlock.
That’s why, even when organizations are moving from cloud to cloud or swapping one set of managed services for another, they still need deep planning capabilities. The platform names change, the service catalogs get refreshed, and vendors repackage features, but the enterprise constraints remain: regulatory obligations, latency and resiliency requirements, identity and access realities, data gravity, contractual risk, and the simple fact that large companies rarely move in a straight line. Cloud architecture is the discipline that prevents transformation programs from becoming expensive improvisation.
Easy to adopt, hard to industrialize
Most companies can get to cloud quickly. A few motivated teams, a credit card, and some well-meaning enthusiasm can produce working workloads in weeks. What you can’t do quickly is scale that success safely across dozens or hundreds of teams while preserving governance, predictable costs, and operational integrity. Industrializing cloud means standardizing patterns without crushing innovation, creating guardrails without blocking delivery, and giving engineers paved roads that are truly easier than off-roading.
This is where architects become force multipliers. In many enterprises, you’ll find dozens of cloud architects assigned across portfolios, projects, and solution development efforts, with a mix of junior and senior levels. Junior architects often focus on implementing reference patterns, helping teams conform to landing zones, and translating standards into deployable templates. Senior architects spend more time shaping the operating model, defining the target architecture, arbitrating trade-offs, and coaching leaders through decisions that ripple across the business.
Compensation follows leverage. In major markets, it’s common to see total annual compensation for experienced cloud architects exceed $200,000, particularly when the role includes broad platform scope, security accountability, and cross-domain influence. One good architect can keep a large organization out of trouble in ways that save far more than the cost of the role.
Daily life of a cloud architect
The best architects don’t “draw diagrams” as an end in itself. They create clarity. On a daily basis, they translate business intent into technical constraints and then into designs that teams can execute. They review solution approaches, challenge hidden assumptions, and ensure that the architecture aligns with the enterprise’s risk posture, delivery maturity, and budget reality.
A typical day includes a steady cadence of conversations and artifacts. There are design reviews where an architect examines network topology, identity flows, encryption boundaries, data classification, and resiliency patterns to verify that a workload won’t fail compliance audits or operational expectations. There are platform decisions about landing zones, shared services, segmentation strategies, private connectivity, and the balance between central control and team autonomy. There is constant attention to cost behavior because architectures don’t just “run.” They consume, and consumption becomes a strategic issue at scale.
Architects also mediate between competing truths. Security wants least privilege and tight controls, product teams want speed, finance wants predictability, and operations wants standardization. The architect’s job is to create a design that meets the business goal with an operationally supportable system. That means documenting nonfunctional requirements, setting service-level objectives, designing for failure, planning disaster recovery, choosing managed services wisely, and preventing accidental complexity.
Another major function is modernization planning. Even when the company is not migrating, it is still evolving: moving from VMs to containers, from containers to serverless, from bespoke data pipelines to managed analytics platforms, or from one identity approach to a unified zero-trust posture. Cloud architects provide the sequencing and the guardrails so that change doesn’t break everything that currently works.
Why demand stays high
Cloud-to-cloud migrations and moves from technology to technology within the cloud are often driven by economics, risk, mergers and acquisitions, data residency, or strategic leverage against a vendor. These moves are rarely clean. They involve interoperability, phased cutovers, temporary duplication, and years of coexistence. In that environment, teams can’t just chase feature parity; they need an architectural blueprint that defines what “done” means and how to get there without creating a brittle, duplicated mess.
Architects are also the antidote to the myth that cloud decisions are reversible. In theory, everything is abstracted. In reality, organizations build around specific services, identity and access management, logging pipelines, networking constructs, and operational habits. Those become sticky. An architect anticipates stickiness and designs for it, using patterns that preserve options where it matters and committing deliberately where the payoff is worth it.
This is also why advancement opportunities are so strong. As architectures grow, the role naturally expands into platform leadership, cloud center of excellence direction, principal architect positions, and enterprise architecture. The most valuable architects become trusted advisors because they can connect strategy to execution without hand-waving.
How to become a cloud architect
Start by building depth in fundamentals and breadth in systems thinking. You can’t architect what you don’t understand, so get hands-on with networking, identity, security, and observability, not just compute and storage. Learn how systems fail, how incidents are managed, and how costs emerge from architecture, because those realities shape every good design.
Next, accumulate “pattern experience.” Build and operate a few real systems end to end, then document what you learned. What would you standardize? What would you avoid? Which trade-offs surprised you? Architecture is applied judgment, and judgment comes from seeing consequences over time. Pair that with structured learning, including cloud provider certifications if they help you organize your knowledge, but don’t confuse badges with mastery. The goal is to be fluent in a cloud’s primitives while remaining capable of designing across clouds and across organizational boundaries.
Finally, develop the communication skills that turn architecture into outcomes. Learn to write clear decision records, present trade-offs without drama, and negotiate constraints with empathy. The strongest architects are credible because they can meet teams where they are, raise the maturity level pragmatically, and keep the enterprise moving forward without creating bureaucracy.
Cloud architects remain in such high demand because they reduce risk, prevent costly missteps, and make cloud adoption scalable and repeatable. Their daily work blends technical design, governance, cost, security, and cross-team alignment. If you want the role, build strong fundamentals, collect real-world pattern experience, and master the communication skills that turn diagrams into dependable systems.
Under the hood with .NET 11 Preview 1 3 Mar 2026, 1:00 am
.NET’s annual cadence has given the project a solid basis for rolling out new features, as well as a path for improving its foundations. No longer tied to Windows updates, the project can provide regular previews alongside its bug and security fixes, allowing us to get a glimpse of what is coming next and to experiment with new features. At the same time, we can see how the upcoming platform release will affect our code.
The next major release, .NET 11, should arrive in November 2026, and the project recently unveiled its first public preview. Like earlier first looks, it’s nowhere near feature complete, with several interesting developments marked as “foundational work not yet ready for general use.” Unfortunately, that means we don’t get to play with them in Preview 1. Work is continuing and can be tracked on GitHub.
What’s most interesting about this first preview isn’t new language features (we’ll learn more about those later in the year), but rather the underlying infrastructure of .NET: the compiler and the runtime. Changes here reveal the intentions behind this year’s release and point to where the team thinks we’ll be running .NET in the future.
Changes for Android and WebAssembly
One big change for 2026 is a move away from the Mono runtime for Android .NET applications to CoreCLR. The modern .NET platform evolved from the open source Mono project, and even though it now has its own runtime in CoreCLR, it has used the older runtime as part of its WebAssembly (Wasm) and Android implementations.
Switching to CoreCLR for Android allows developers to get the same features on all platforms and makes it easier to ensure that MAUI behaves consistently wherever it runs. The CLR team notes that as well as improving compatibility, there will be performance improvements, especially in startup times.
For Wasm, the switch should again make it easier to ensure common Blazor support for server-side and for WebAssembly code, simplifying the overall application development process. The project to make this move is still in its early stages, with an initial SDK and interoperability work complete. There’s still a lot to do before it’ll be possible to run more than “Hello World” using CoreCLR on Wasm and WebAssembly System Interface (WASI). The project aims to have support for RyuJIT by the end of the .NET 11 development cycle.
Full support won’t arrive until .NET 12, but having a .NET runtime that’s code-compatible with the rest of .NET for WebAssembly is a big win for both platforms. You should treat the .NET 11 Wasm CoreCLR capabilities as a preview, allowing you to experiment with various scenarios and use those experiments to help guide future development.
Native support for distributed computing
One of the more interesting new features appears to be a response to changes in the ways we build and deliver code. Much of what we build still runs on one machine, especially desktop applications, although more and more code needs to interact with external APIs. That code must run asynchronously so that an API call doesn’t become a blocker and hold up a user’s PC or device while an application waits for a response from a remote server. Operating this way is even more important for cloud-native applications, which are often loosely connected sets of microservices managed by platforms like Kubernetes or serverless Functions on Azure or another cloud platform.
.NET 11’s CoreCLR is being re-engineered to improve support for this increasingly important set of design patterns. Earlier releases required an explicit opt-in to use runtime asynchronous support in the CLR. In Preview 1, runtime async on CoreCLR is enabled by default; you don’t need to do anything to test how your code works with this feature, apart from installing the preview bits and using them with your applications.
For now, this new tool is limited to your own code, as core libraries are still compiled without runtime async support. That will change in the next few months as libraries are recompiled and added to future previews. Third-party code will most likely wait until Microsoft releases a preview with a “go live” license.
You can get a feel for how this feature is progressing by reading what the documentation describes as an “epic issue,” which lists the current state of the feature and the steps that remain. Work began during the .NET 10 timeframe, so much of the foundational work is complete, although several key pieces are still open issues, including just-in-time support on multicore systems and certain key optimizations, such as recompiling on the fly with profile-guided optimization to respond to actual workloads.
It’s important to note that issues like these are a small part of what needs to be delivered to land runtime async support in .NET 11. With several months between Preview 1’s arrival and the final general availability release, the .NET team has plenty of time to deliver these pieces.
With the feature still in development, you’ll need to set project file flags to enable support in ahead-of-time (AOT) compiled applications. This entails adding a couple of lines to the project file and then recompiling the application. For now, it’s a good idea to build and test your AOT applications as usual, then add the runtime async flags and recompile when you are ready to try out the new feature.
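The article doesn’t spell out which project-file lines are needed. As a rough sketch only: in the .NET 10-era experimental previews, the opt-in looked like the properties below. The exact names may well change before .NET 11 ships, so treat everything except `PublishAot` (a long-established MSBuild property) as an assumption to check against the current preview documentation.

```xml
<PropertyGroup>
  <!-- Compile ahead-of-time to native code (established MSBuild property) -->
  <PublishAot>true</PublishAot>
  <!-- Experimental runtime async opt-in as seen in earlier previews;
       these property names are assumptions and may differ in .NET 11 -->
  <LangVersion>preview</LangVersion>
  <Features>$(Features);runtime-async=on</Features>
</PropertyGroup>
```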
Changes to hardware support
One issue to note is that the updated runtime in .NET 11 has new hardware requirements: it needs modern instruction sets, so older hardware may not be compatible. Arm64 now requires armv8.0-a with LSE (armv8.2-a with RCPC on Windows and on M1 Macs), and x64 on Windows and Linux needs x86-64-v3.
This is where you might find some breaking changes, as older hardware will now give an error message and code will not run. This shouldn’t be an issue for most modern PCs, devices, and servers, as these requirements align with .NET’s OS support, rather than supporting older hardware that’s becoming increasingly rare. However, if you’re running .NET on hardware that’s losing support, you will need to upgrade or stick with older code for another year or two.
There are other hardware platforms that get .NET support, with runtimes delivered outside of the main releases. This includes support for RISC-V hardware and IBM mainframes. For now, both are minority interests: mainframes to support migrations and updates of older enterprise software, and RISC-V to deliver code on the next generation of open hardware. It’ll be interesting to see if RISC-V support becomes mainstream, as silicon performance is improving rapidly and RISC-V is already available in common Internet of Things development boards and processors from organizations and companies like Raspberry Pi, where it is part of the RP2350 microcontroller system on a chip.
Things like this make it interesting to read the runtime documentation at the start of a new cycle of .NET development. By reading the GitHub issues and notes, we can see some of the thinking that goes into .NET and can take advantage of the project’s open design philosophy to plan our own software development around code that won’t be generally available until the end of the year.
It’s still important to understand the underpinnings of a platform like .NET. The more we know, the more we can see that a lot of moving parts come together to support our code. It’s useful to understand where we can take advantage of compilers and runtimes to improve performance, reliability, and reach.
After all, that’s what the teams are doing as they build the languages we will use to write our applications as .NET moves on to another preview, another step on the road to .NET 11’s eventual release.
Why AI requires rethinking the storage-compute divide 3 Mar 2026, 1:00 am
For more than a decade, cloud architectures have been built around a deliberate separation of storage and compute. Under this model, storage became a place to simply hold data while intelligence lived entirely in the compute tier.
This design worked well for traditional analytics jobs operating on structured, table-based data. These workloads are predictable, often run on a set schedule, and involve a smaller number of compute engines operating over the datasets. But as AI reshapes enterprise infrastructure and workload demands, shifting data processing toward massive volumes of unstructured data, this model is breaking down.
What was once an efficiency advantage is increasingly becoming a structural cost.
Why AI exposes the cost of separation
AI introduces fundamentally different demands than the analytics workloads businesses have grown accustomed to. Instead of tables and rows processed in batch jobs by an engine, modern AI pipelines now process large amounts of unstructured and multimodal data, while also generating large volumes of embeddings, vectors, and metadata. At the same time, processing is increasingly continuous, with many compute engines touching the same data repeatedly—each pulling the data out of storage and reshaping it for its own needs.
The result isn’t just more data movement between storage and compute, but more redundant work. The same dataset might be read from storage, transformed for model training, then read again and reshaped for inference, and again for testing and validation—each time incurring the full cost of data transfer and transformation. Given this, it’s no surprise that data scientists spend up to 80% of their time just on data preparation and wrangling, rather than building models or improving performance.
While these inefficiencies can be easy to overlook at a smaller scale, they quickly become a primary economic constraint as AI workloads grow, translating not only into wasted hours but real infrastructure cost. For example, 93% of organizations today say their GPUs are underutilized. With top-shelf GPUs costing several dollars per hour across major cloud platforms, this underutilization can quickly compound into tens of millions of dollars of paid-for compute going to waste. As GPUs increasingly dominate infrastructure budgets, architectures that leave them waiting on I/O become increasingly difficult to justify.
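To see how quickly that waste compounds, here is a back-of-the-envelope calculation in Python. Every figure (fleet size, hourly rate, utilization) is a hypothetical assumption chosen for illustration, not data from the surveys cited above.

```python
# Back-of-the-envelope GPU waste estimate. All inputs are hypothetical
# assumptions for illustration, not measured values.
gpu_count = 1_000            # GPUs in a large AI fleet (assumption)
hourly_rate = 4.00           # $/GPU-hour for a top-shelf cloud GPU (assumption)
utilization = 0.40           # fraction of paid hours doing useful work (assumption)
hours_per_year = 24 * 365

paid = gpu_count * hourly_rate * hours_per_year   # total annual GPU spend
wasted = paid * (1 - utilization)                 # spend on idle capacity

print(f"Annual GPU spend: ${paid:,.0f}")          # $35,040,000
print(f"Wasted on idle:   ${wasted:,.0f}")        # $21,024,000
```

Even with these modest assumptions, idle capacity alone lands in the tens of millions of dollars per year, which is the scale of waste the article describes.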
From passive storage to smart storage
The inefficiencies exposed by AI workloads point to a fundamental shift in how storage and compute must interact. Storage can no longer exist solely as a passive system of record. To support modern AI workloads efficiently and get the most value out of the data that companies have at their disposal, compute must move closer to where data already lives.
Industry economics make this clear. A terabyte of data sitting in traditional storage is largely a cost center. When that same data is moved into a platform with an integrated compute layer, its economic value increases by multiples. The data itself hasn’t changed; the only difference is the presence of compute that can transform that data and serve it in useful forms.
Rather than continuing to move data to capture that value, the answer is to bring compute to the data. Data preparation should happen once, where the data lives, and be reused across pipelines. Under this model, storage becomes an active layer where data is transformed, organized, and served in forms optimized for downstream systems.
This shift changes both performance and economics. Pipelines move faster because data is pre-prepared. Hardware stays more productive because GPUs spend less time waiting on redundant I/O. The costs of repeated data preparation begin to disappear.
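The “prepare once, reuse everywhere” idea can be sketched in a few lines of Python. This is purely conceptual: the dataset, the `prepare` transformation, and the in-storage cache are toy stand-ins, not any particular product’s API.

```python
# Conceptual sketch: every consumer re-preparing data (passive storage)
# vs. preparing once where the data lives and serving the cached result.
prep_runs = 0  # counts how many times the expensive preparation executes

def prepare(raw):
    """Stand-in for an expensive step such as cleaning, tokenizing, or embedding."""
    global prep_runs
    prep_runs += 1
    return [x * 2 for x in raw]  # toy transformation

raw_data = [1, 2, 3]

# Passive storage: training, inference, and validation each re-prepare.
for consumer in ("training", "inference", "validation"):
    _ = prepare(raw_data)
assert prep_runs == 3  # the same work ran three times

# Smart storage: prepare once at the data layer, then serve the result.
prep_runs = 0
cache = {}

def serve(name, raw):
    if name not in cache:        # transform only on the first request
        cache[name] = prepare(raw)
    return cache[name]

for consumer in ("training", "inference", "validation"):
    _ = serve("dataset-v1", raw_data)
assert prep_runs == 1  # preparation ran once; every consumer reused it
```

The assertions make the economics concrete: three consumers cost three preparation passes under the passive model, but only one when the transformation lives with the data.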
Under this new model, “smart storage” changes data from something that is merely stored to a resource that is continuously understood, enriched, and made ready for use across AI systems. Rather than leaving raw data locked in passive repositories and relying on external pipelines to interpret it, smart storage applies compute directly within the data layer to generate persistent transformations, metadata, and optimized representations as data arrives.
By preparing data once and reusing it across workflows, organizations allow storage to become an active platform instead of a bottleneck. Without this shift, organizations remain trapped in cycles of redundant data processing, constant reshaping, and compounding infrastructure cost.
Preparing for AI-era infrastructure
The cloud’s separation of storage and compute was the right architectural decision for its time. But AI workloads have fundamentally changed the economics of data and exposed the limits of this approach—a constraint I’ve watched kill numerous enterprise AI initiatives, and a core reason I founded DataPelago.
While the industry has begun focusing on accelerating individual steps in the data pipeline, efficiency is no longer determined by squeezing marginal gains from existing architectures. It is now determined by building new architectures that make data usable without repeated preparation, excessive movement, or wasted compute. As AI’s demands continue to crystallize, it is becoming increasingly clear that the next generation of infrastructure will be defined by how intelligently storage and compute are brought together.
The companies that succeed will be the ones that make smart storage a foundation of their AI strategy.
—
New Tech Forum provides a venue for technology leaders—including vendors and other outside contributors—to explore and discuss emerging enterprise technology in unprecedented depth and breadth. The selection is subjective, based on our pick of the technologies we believe to be important and of greatest interest to InfoWorld readers. InfoWorld does not accept marketing collateral for publication and reserves the right to edit all contributed content. Send all inquiries to doug_dineley@foundryco.com.
Rust developers have three big worries – survey 2 Mar 2026, 2:20 pm
Rust developers are mostly satisfied with the current pace of evolution of the programming language, but many worry that Rust does not get enough usage in the tech industry, that Rust may become too complex, and that the developers and maintainers of Rust are not properly supported.
These findings are featured in the Rust Survey Team’s 2025 State of Rust Survey report, which was announced March 2. The survey ran from November 17, 2025, to December 17, 2025, and tallied 7,156 responses, with different numbers of responses for different questions.
Asked their opinion of the pace at which the Rust language is evolving, 57.6% of the developers surveyed reported being satisfied with the current pace, compared to 57.9% in the 2024 report. Asked about their biggest worries for the future of Rust, 42.1% cited not enough usage in the tech industry, compared to 45.5% in 2024. The other biggest worries were that Rust may become too complex (41.6% in 2025 versus 45.2% in 2024) and that the developers and maintainers of Rust are not properly supported (38.4% in 2025 versus 35.4% in 2024).
The survey also asked developers which aspects of Rust present non-trivial problems to their programming productivity. Here slow compilation led the way, with 27.9% of developers saying slow compilation was a big problem and 54.68% saying that compilation could be improved but did not limit them. High disk space usage and a subpar debugging experience were also top complaints, with 22.24% and 19.90% of developers citing them as big problems.
In other findings in the 2025 State of Rust Survey report:
- 91.7% of respondents reported using Rust in 2025, down from 92.5% in 2024. But 55.1% said they used the language daily or nearly daily last year, up from 53.4% in 2024.
- 56.8% said they were productive using Rust in 2025, compared to 53.5% in 2024.
- When it comes to operating systems in 2025, 75.2% were using Linux regularly; 34.1% used macOS, and 27.3% used Windows. Linux also was the most common target of Rust software development, with 88.4% developing for Linux.
- 84.8% of respondents who used Rust at work said that using Rust has helped them achieve their goals.
- Generic const expressions was the leading unimplemented or nightly-only feature that respondents in 2025 were looking to see stabilized, with 18.35% saying the feature would unblock their use case and 41.53% saying it would improve their code.
- Visual Studio Code was the IDE most commonly used to code with Rust on a regular basis in 2025, with 51.6% of developers favoring it.
- 89.2% reported using the most current version of Rust in 2025.
Buyer’s guide: Comparing the leading cloud data platforms 2 Mar 2026, 7:30 am
Choosing the right data platform is critical for the modern enterprise. These platforms not only store and protect enterprise data, but also serve as analytics engines that source insights for pivotal decision-making.
There are many offerings on the market, and they continue to evolve with the advent of AI. However, five prominent players — Databricks, Snowflake, Amazon Redshift, Google BigQuery, and Microsoft Fabric — stand out as the leading options for your enterprise.
Databricks
Founded in 2013 by the creators of the open-source analytics platform Apache Spark, Databricks has established itself as one of the dominant players in the data market. Notably, the company coined the term and developed the concept of the data lakehouse, which combines the capabilities of data lakes and data warehouses to give enterprises better handling of their data estates.
Data lakehouses create a single platform incorporating both data lakes (where large amounts of raw data are stored) and data warehouses (which contain categories of structured data) that typically operate as separate architectures. This unified system allows enterprises to query all data sources together and govern the workloads that use that data.
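Conceptually, the lakehouse lets a single engine query both tiers together. The sketch below imitates that in miniature using Python’s built-in sqlite3 as the stand-in engine: structured “warehouse” rows and raw JSON “lake” records land in one place and are joined in one query. All table and column names are invented for illustration.

```python
import json
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()

# "Warehouse" tier: structured, schema-on-write rows.
cur.execute("CREATE TABLE orders (order_id INTEGER, amount REAL)")
cur.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 9.99), (2, 25.00)])

# "Lake" tier: raw JSON events, parsed at load time (schema-on-read).
raw_events = ['{"order_id": 1, "channel": "web"}',
              '{"order_id": 2, "channel": "mobile"}']
cur.execute("CREATE TABLE events (order_id INTEGER, channel TEXT)")
cur.executemany("INSERT INTO events VALUES (?, ?)",
                [(e["order_id"], e["channel"]) for e in map(json.loads, raw_events)])

# One engine, one query across both tiers -- the lakehouse promise in miniature.
rows = cur.execute("""
    SELECT e.channel, SUM(o.amount)
    FROM orders o JOIN events e ON o.order_id = e.order_id
    GROUP BY e.channel ORDER BY e.channel
""").fetchall()
print(rows)  # [('mobile', 25.0), ('web', 9.99)]
```

A real lakehouse does this at petabyte scale over open table formats rather than a local database, but the principle is the same: raw and structured data share one governed store and one query surface.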
The lakehouse has become its own category and is now widely used and incorporated into many IT stacks.
Databricks presents itself as a “data+AI” company, and calls itself the only platform in the industry featuring a unified governance layer across data and AI, as well as a single unified query engine across ML, BI, SQL, and ETL.
Databricks’ Data Intelligence Platform has a strong focus on ML/AI workloads and is deeply tied to the Apache Spark ecosystem. Its open, flexible environment supports almost any data type and workload.
Further, to support the agentic AI era, Databricks has rolled out a Mosaic-powered Agent Bricks offering, which gives users tools to deploy customized AI agents and systems based on their unique data and needs. Enterprises can use retrieval-augmented generation (RAG) to build agents on their custom data and use Databricks’ vector database as a memory function.
Core platform: Databricks’ core offering is its Data Intelligence Platform, which is cloud-native — meaning it was designed from the get-go for cloud computing — and built to understand the semantics of enterprise data (thus the “intelligence” part).
The platform sits on a lakehouse foundation and open-format software interfaces (Delta Lake and Apache Iceberg) that support standardized interactions and interoperability. It also incorporates Databricks’ Unity Catalog, which centralizes access control, quality monitoring, data discovery, auditing, lineage, and security.
DatabricksIQ, Databricks’ Data Intelligence Engine, fuels the platform. It uses generative AI to understand semantics, and is based on innovations from MosaicML, which Databricks acquired in 2023.
Deployment method: Databricks is a built-on cloud platform that has established partnerships with the top cloud providers, including Amazon Web Services (AWS), Microsoft Azure, and Google Cloud Platform (GCP).
Pricing: A pay-as-you-go model with no upfront costs. Customers only pay for the products they use at “per second granularity.” There are different pricing-per-unit options for data engineering, data warehousing, interactive workloads, AI, and operational databases (ranging from $0.07 to $0.40 per unit). Databricks also offers committed use contracts that provide discounts when customers commit to certain levels of usage.
Challenges/trade-offs: Operation can be more complex and less “plug and play”: Users are essentially running an Apache Spark-based platform, so there’s more to manage than in serverless environments that are easier to operate and tune. Pricing models also tend to be more complex.
Additional considerations for Databricks
- A unified stack provides data pipelines, feature engineering, BI, ML training, and other complex tasks on the same storage layer.
- Support for open formats and engines — including Delta and Iceberg — doesn’t lock users into a storage engine.
- Unity Catalog provides a common governance layer, and data descriptions and tags can help the platform learn an enterprise’s unique semantics.
- Agent Bricks and MLflow offer a strong AI and ML toolkit.
Snowflake
Snowflake, founded in 2013, is considered a pioneer in cloud data warehousing, serving as a centralized repository for both structured and semi-structured data that enterprises can easily access for analysis and business intelligence (BI).
The company is considered a direct competitor to Databricks. In fact, as a challenge to the data lakehouse pioneer, Snowflake claims it has always been a hybrid of data warehouses and data lakes.
Core platform: Snowflake positions itself as an “AI Data Cloud” that can manage all data-driven enterprise activities. Like Databricks, its platform is cloud-native, and it unifies storage, elastic compute, and cloud services.
Snowflake can support AI model development (notably through its agent-builder platform Cortex AI), advanced analytics, and other data-heavy tasks. Its Snowgrid cross-cloud layer supports global connectivity across different regions and clouds (thus allowing for consistency in performance) while a Snowflake Horizon governance layer manages access, security, privacy, compliance, and interoperability.
Integrated Snowpipe and Openflow capabilities allow for real-time ingestion, integration, and streaming, while Snowpark Connect supports migration and interoperability with Apache Spark codebases. Further, Cortex AI allows users to securely run large language models (LLMs) and build generative AI and agentic apps.
Deployment method: Like Databricks, Snowflake has partnerships with major players, running as software-as-a-service (SaaS) on AWS, Azure, GCP, and other cloud providers. Notably, a key strategic partnership with Microsoft allows customers to buy and run Snowflake on Azure directly and integrate it with other Azure services.
Pricing: A consumption-based pricing model. Customers are charged for compute in credits costing $2 and up, based on subscription edition (Standard, Enterprise, Business Critical, or Virtual Private Snowflake) and cloud region. A monthly fee for data stored in Snowflake is calculated based on average use.
Snowflake strengths: Snowflake positions itself as a turnkey, managed SQL platform for data‑intensive applications with strong governance and minimal tuning required.
Further, the company continues to innovate in the agentic AI era. For instance, Snowflake Intelligence allows users to ask questions, and get answers, about their data in natural language. Cortex AI provides secure access to leading LLMs: Teams can call models, perform text-to-SQL commands, and run RAG inside Snowflake without exposing their data.
Snowflake challenges/trade-offs
- Snowflake’s proprietary storage and compute engine are less open and controllable than a lakehouse environment.
- Cost can be difficult to visualize and manage due to credit-based pricing and serverless add-ons.
- Users have reported weaker support for unstructured data and data streaming.
Additional considerations for Snowflake
- Elastic compute provides strong performance for numerous users, data volumes, and workloads in a single, scalable engine.
- There’s little infrastructure to manage: Snowflake abstracts away most operational work, such as optimization, planning, and authentication.
- Storage is interoperable and users get un-siloed access.
- Snowgrid capabilities work across regions and clouds — whether AWS, Azure, GCP, or others — to allow for data sharing, portable workloads, and consistent global policies.

These five platforms are the dominant leaders in the cloud data ecosystem. While they all handle large-scale analytics, they differ significantly in their architecture (e.g., warehouse vs. lakehouse), ecosystem ties, and target users.
Amazon Redshift
Amazon Web Services (AWS) Redshift is Amazon’s fully managed, petabyte-scale cloud data warehouse designed to replace more complex, expensive on-premises legacy infrastructure.
Core platform: Amazon Redshift is a queryable data warehouse optimized for large-scale analytics on massive datasets. It is built on two core architectural pillars: columnar storage and massively parallel processing (MPP). Data is stored by column rather than by row, and MPP distributes query execution across multiple nodes so large datasets can be processed in parallel.
Redshift uses standard SQL to interact with data in relational databases and integrates with extract, transform, load (ETL) tools — like AWS Glue — that manage and prepare data. Through its Amazon Redshift Spectrum feature, users can directly query data from files on Amazon Simple Storage (Amazon S3) without having to load data into tables.
Additionally, with Amazon Redshift ML, developers can use simple SQL to build and train Amazon SageMaker machine learning (ML) models based on their Redshift data.
Redshift is deeply integrated in the AWS ecosystem, allowing for easy interoperability with numerous other AWS services.
Deployment method: Amazon Redshift is fully managed by AWS and is offered in both provisioned (a flat, predetermined rate for a set amount of resources, whether used or not) and serverless (pay-per-use) options.
Pricing: Provisioned starts at $0.543 per hour, while serverless begins at $1.50 per hour. Both options scale to petabytes of data and support thousands of concurrent users.
Amazon Redshift strengths: AWS Redshift’s main differentiator is its strong integration in the broader AWS ecosystem: It can easily be connected with S3, Glue, SageMaker, Kinesis data streaming, and other AWS services. Naturally, this makes it a good fit for enterprises already leaning heavily into AWS. They can securely access, combine, and share data with minimal movement or copying.
Further, AWS has introduced Amazon Q, a generative AI assistant with specialized capabilities for software developers, BI analysts, and others building on AWS. Users can ask Amazon Q about their data to make decisions, speed up tasks and, ideally, increase productivity.
Amazon Redshift challenges/trade-offs
- Ecosystem lock-in: While it fits quickly and easily into the AWS environment, Redshift might not be a good fit for enterprises with multi-cloud or cloud-agnostic strategies.
- Even though it is managed by AWS, users say it is not as hands-off as other options: some compaction tasks (vacuum) must be run manually, ETL processes must be checked regularly, and unusual queries can degrade service performance without continuous monitoring.
Additional considerations for Redshift
- Devs find Redshift easy to use because of its SQL backbone.
- The platform is highly performant and scalable thanks to its columnar architecture, decoupled compute and storage, and MPP.
- AWS offers flexible deployment options: provisioned clusters for more predictable workloads, serverless for spikier ones.
- Zero-ETL capabilities simplify data ingestion without complex pipelines, thus supporting near real-time analytics.
Google BigQuery
Google BigQuery started out as a fully managed cloud data warehouse that Google now sells as an autonomous data and AI platform that automates the entire data lifecycle.
Core platform: Google BigQuery is a serverless, distributed, columnar data warehouse optimized for petabyte-scale workloads and SQL‑based analytics. It is built on Google’s Dremel execution engine, allowing it to allocate query resources on an as-needed basis and quickly analyze terabytes of data with fewer resources.
BigQuery decouples compute (Dremel) and storage, housing data in columns in Google’s distributed file system Colossus. Data can be ingested from operational systems, logs, SaaS tools, and other sources, typically via extract, transform, load (ETL) tools.
BigQuery uses familiar SQL commands, allowing developers to easily train, evaluate, and run ML models for capabilities like linear regression and time-series forecasting for prediction, and k-means clustering for analytics. Combined with Vertex AI, the platform can perform predictive analytics and run AI workflows on top of warehouse data.
Further, BigQuery can integrate agentic AI, such as pre-built data engineering, data science, analytics, and conversational analytics agents, or devs can use APIs and agent development kit (ADK) integrations to create customized agents.
Deployment method: BigQuery is fully managed by Google and serverless by default, meaning users do not need to provision or manage individual servers or clusters.
Pricing: Offers three pricing tiers. Free users get up to 1 tebibyte (TiB) of queries per month. On-demand pricing (per-TiB) charges customers based on the number of bytes processed by each query. Capacity pricing (per slot-hour) charges customers based on compute capacity used to run queries, measured in slots (virtual CPUs) over time.
Google BigQuery strengths: BigQuery is deeply coupled with the GCP ecosystem, making it an easy choice for enterprises already heavily using Google products. It is scalable, fast, and truly serverless, meaning customers don’t have to manage or provision infrastructure.
GCP also continues to innovate around AI: BigQuery ML (BQML) helps analysts build, train, and launch ML models with simple SQL commands directly in the interface, and Vertex AI can be leveraged for more advanced MLOps and agentic AI workflows.
Google BigQuery challenges/trade-offs
- Costs for heavy workloads can be unpredictable, requiring discipline around partitioning and clustering.
- Users report difficulties around testing and schema mismatches during ETL processes.
Other considerations for BigQuery
- BigQuery can analyze petabytes of data in seconds because its architecture decouples storage (Colossus) and compute (Dremel engine).
- Google automatically handles resource allocation, maintenance, and scaling, so teams do not have to focus on operations.
- Flexible payment models cover both predictable and more sporadic workloads.
- Standard SQL support means analysts can use their existing skills to query data without retraining.
Microsoft Fabric
Microsoft Fabric is a SaaS data analytics platform that integrates data warehousing, real-time analytics, and business intelligence (BI). It is built on OneLake, Microsoft’s “logical” data lake that uses virtualization to provide users a single view of data across systems.
Core platform: Fabric is delivered via SaaS and all workloads run on OneLake, Microsoft’s data lake built on Azure Data Lake Storage (ADLS). Fabric’s catalog provides centralized data lineage, discovery, and governance of analytics artifacts (tables, lakehouses and warehouses, reports, ML tools).
Several workloads run on top of OneLake so that they can be chained without moving data across services. These include a data factory (with pipelines, dataflows, connectors, and ETL/ELT to ingest and process data); a lakehouse with Spark notebooks and pipelines for data engineering on a Delta format; and a data warehouse with SQL endpoints, T‑SQL compatibility, clustering and identity columns, and migration tooling.
Further, real-time intelligence based on Microsoft’s Eventstream and Activator tools ingests telemetry and other Fabric events without the need for coding; this allows teams to monitor data and automate actions. Microsoft’s Power BI sits natively on OneLake, and a Direct Lake feature can query lakehouse data without importing or dual storage.
Fabric also integrates with Azure Machine Learning and Foundry so users can develop and deploy models and perform inferencing on top of Fabric datasets. Further, the platform features integrated Microsoft Copilot agents. These can help users write SQL queries, notebooks, and pipelines; generate summaries and insights; and populate code and documentation.
Microsoft recommends a “medallion” lakehouse architecture in Fabric. The goal of this type of format is to incrementally improve data structure and quality. The company refers to it as a “three-stage” cleaning and organizing process that makes data “more reliable and easier to use.”
The three stages include: Bronze (raw data that is stored exactly as it arrives); Silver (cleaned, with errors fixed, formats standardized, and duplicates removed); and Gold (curated and ready to be organized into reports and dashboards).
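As a rough illustration of that three-stage flow (the schema and cleaning rules here are invented for the example, and real Fabric pipelines would typically use Spark notebooks rather than pandas), the medallion progression might look like:

```python
import pandas as pd

# Bronze: raw data stored exactly as it arrives (hypothetical schema).
bronze = pd.DataFrame({
    "order_id": [1, 1, 2, 3],
    "amount": ["10.50", "10.50", None, "7.25"],
    "region": ["us-east", "us-east", "EU-WEST", "eu-west"],
})

# Silver: drop duplicates, remove broken rows, standardize formats.
silver = (
    bronze.drop_duplicates()
          .dropna(subset=["amount"])
          .assign(amount=lambda d: d["amount"].astype(float),
                  region=lambda d: d["region"].str.lower())
)

# Gold: curate an aggregate ready for reports and dashboards.
gold = silver.groupby("region", as_index=False)["amount"].sum()
```

Each stage reads only from the one before it, which is what makes the data incrementally “more reliable and easier to use” as it moves toward Gold.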
Deployment method: Fabric is offered as a SaaS fully managed by Microsoft and hosted in its Azure cloud computing platform.
Pricing: A capacity-based licensing model (F SKUs) with two billing options: flexible pay-as-you-go, which is billed per second and can be scaled up or paused; and reserved capacity, prepaid one- to three-year plans that can offer savings of roughly 40% to 50% for predictable workloads. Data storage in OneLake is typically priced separately.
Microsoft Fabric strengths
- Explicitly designed as an all‑in‑one SaaS, meaning one platform for ingestion, lakehouse, warehouse, and real‑time ML and BI.
- Built-in Copilot can help accelerate common tasks (such as documentation or SQL), which users report as an advantage over competitors whose AI tools aren’t as tightly-integrated.
- Microsoft recommends and documents medallion architecture, with lake views that automate evolutions from bronze to silver to gold.
Microsoft Fabric challenges/trade-offs
- Fabric is newer (released in GA in 2023); users complain that some features feel early-stage, and documentation and best practices aren’t as evolved.
- Can lead to lock-in to the Microsoft stack, which makes it less appealing to enterprises looking for more open, multi‑cloud tools like Databricks or Snowflake.
- Because pricing is capacity/consumption‑based, careful FinOps may be necessary to avoid surprises.
Other considerations for Microsoft Fabric
- Direct Lake mode allows Power BI to analyze massive datasets directly from OneLake without the “import/refresh” cycles required by other platforms.
- Zero-ETL virtualization allows Fabric to surface data from Snowflake, Databricks, or Amazon S3: You can see and query your Snowflake tables inside Fabric without moving a single byte of data.
- Copilot Integration: Native AI assistants help users write Spark code, build data factory pipelines, and even generate entire Power BI reports from natural language prompts.
Bottom line
Choosing the right cloud data platform is a strategic decision extending beyond simple storage and access. Leading providers now blend data stores, governance layers, and advanced AI capabilities, but they differ when it comes to operational complexity, ecosystem integration, and pricing.
Ultimately, the right choice depends on an organization’s individual cloud strategy, operational maturity, workload mix, AI ambitions, and ecosystem preference — lock-in versus architectural flexibility.
FinOps for agents: Loop limits, tool-call caps and the new unit economics of agentic SaaS 2 Mar 2026, 2:00 am
The first time my team shipped an agent into a real SaaS workflow, the product demo looked perfect. The production bill did not. A small percentage of sessions hit messy edge cases, and our agent responded the way most agents do: it tried harder. It re-planned, re-queried, re-summarized and retried tool calls. Users saw a slightly slower response, and finance saw a step-change in variable spend.
That week changed how we think about agent design. In agentic SaaS, cost is a reliability metric. Loop limits and tool-call caps protect your margin.
I call this discipline FinOps for Agents: a practical way to govern loops, tools and model spend so your gross margin survives contact with real customers. I have found progress comes from putting product, engineering and finance in the same room, replaying agent traces and agreeing on guardrails that define the user experience.
Why does FinOps look different for agentic SaaS?
Measuring the Cost of Goods Sold (COGS) for classic SaaS is well understood: compute, storage, third‑party services and support. Agentic SaaS adds a new axis: cognition. Every plan, reflection step, retrieval pass and tool call burns tokens, and ambiguity often pushes agents to do more work to resolve it.
FinOps practitioners are increasingly treating AI as its own cost domain. The FinOps Foundation highlights token-based pricing, cost-per-token and cost-per-API-call tracking and anomaly detection as core practices for managing AI spend.
Seat count still matters, yet I have watched two customers with the same licenses generate a 10X difference in inference and tool costs because one had standardized workflows and the other lived in exceptions. If you ship agents without a cost model, your cloud invoice quickly becomes the lesson plan.
The agentic COGS stack
As head of AI R&D, I spend a lot of time with architects and CTOs, and the conversation almost always lands on a COGS breakdown that mirrors the agent’s architecture:
- Model inference: Tokens across planner/executor/verifier calls, usually the largest contributor to COGS of agentic software.
- Tools and side effects: Paid APIs (e.g., web search), per-record automation fees, retries and idempotent write safeguards.
- Orchestration runtime: Workers, queues, state storage and sandboxed execution for code and documents.
- Memory and retrieval: Embeddings, vector storage, index refresh and context-building or summarization checkpoints.
- Governance and observability: Tracing, evaluation suites, safety filters and audit retention.
- Humans in the loop: Review time, escalations and support load created by agent mistakes.
How does FinOps help standardize unit economics when outcomes span actions, workflows and tasks?
Gartner has cautioned that cost pressure can derail agentic programs, which makes unit economics a delivery requirement.
When it comes to most SaaS products, customers don’t buy raw tokens; instead, they buy progress toward completing their work, e.g., cases resolved, pipelines updated, reports produced or exceptions handled. Unit economics becomes actionable when we measure at the boundary where that value is delivered, and that boundary expands as your agentic SaaS matures: from answers in the UI, to a single approved operation, to a multi-step process and eventually to a recurring responsibility the agent runs end-to-end. In the following table, we lay out this structure and the corresponding unit metric and outcome to meter at each level of scope.
Where to meter: Actions, workflows and tasks
| Scope of integration | What it means | Example | Unit economics | What outcomes to meter |
| --- | --- | --- | --- | --- |
| Assistance | The user asks, AI answers. No integration. | “Brief me on Acme: last touchpoints, open opp status and the next best step.” | Cost per query. | Seats. |
| Wrap an action | AI proposes one operation. Users generally approve or decline. | “Update this opportunity to Proposal, set the close date to Feb 15 and create a follow-up task.” | Cost per approved action. | Actions executed. |
| Wrap a workflow | AI assists across a multi-step process. | “When a new inbound lead arrives, enrich it, score fit, route to the right rep and start the first-touch sequence.” | Cost per workflow. | Workflows completed. |
| Wrap a task | AI owns a recurring responsibility. | “Run weekly pipeline hygiene end-to-end: fix missing fields, merge duplicates, advance stale stages and only ask me about exceptions.” | Cost per run. | Tasks × frequency, hours saved. |
The FinOps metric product and finance agree on: CAPO, the cost-per-accepted-outcome
In early pilots, teams obsess over token counts. However, for a scaled agentic SaaS running in production, we need one number that maps directly to value: Cost-per-Accepted-Outcome (CAPO). CAPO is the fully loaded cost to deliver one accepted outcome for a specific workflow.
The phrase “accepted outcome” matters. A run that completes quickly and produces the wrong answer still consumes tokens, retrieval and tool calls. I define acceptance as a concrete quality gate: automated validation, a user “Apply” click or a downstream success signal such as “case not reopened in 7 days.”
Forrester’s FinOps research highlights the importance of operating-model maturity and step-by-step practice building for cost optimization for agentic software.
We calculate CAPO per workflow and per segment, then watch the distribution, not just the average. Median tells us where the product feels efficient. P95 and P99 tell us where loops, retries and tool storms are hiding.
Note that failed runs belong in CAPO automatically: we treat the numerator as total fully loaded spend for that workflow (accepted + failed + abandoned + retried) and the denominator as accepted outcomes only, so every failure is “paid for” by the successes.
Tagging each run with an outcome state (accepted, rejected, abandoned, timeout, tool-error) and attributing its cost to a failure bucket allows us to track Failure Cost Share (failed-cost ÷ total-cost) alongside CAPO and see whether the problem is acceptance rate, expensive failures or retry storms.
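A minimal sketch of that bookkeeping (the run records and dollar figures are invented for illustration, not a prescribed schema):

```python
from statistics import median

# Hypothetical per-run records for one workflow: (outcome_state, fully_loaded_cost_usd)
runs = [
    ("accepted", 0.08), ("accepted", 0.11), ("accepted", 0.09),
    ("rejected", 0.30), ("timeout", 0.55), ("accepted", 0.10),
]

total_cost = sum(cost for _, cost in runs)
accepted_count = sum(1 for state, _ in runs if state == "accepted")
failed_cost = sum(cost for state, cost in runs if state != "accepted")

# CAPO: total fully loaded spend divided by accepted outcomes only,
# so every failure is "paid for" by the successes.
capo = total_cost / accepted_count

# Failure Cost Share: failed-cost / total-cost.
failure_cost_share = failed_cost / total_cost

# Watch the distribution, not just the average: the median shows where the
# product feels efficient; the tail (P95/P99) is where loops and retries hide.
median_run_cost = median(cost for _, cost in runs)
```

In this toy sample, two expensive failed runs dominate the spend, which is exactly the pattern Failure Cost Share is meant to surface alongside CAPO.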
These metrics naturally translate to measurable targets that inference engineering teams can rally behind.
Which budget guardrails keep FinOps off your back?
A well-designed agent has a budget contract the way a well-run service has an SLO. I encode that contract in five guardrails, enforced at the gateway where every model and tool call flows:
- Loop/step limit: Cap planning, reflection and verification cycles. Escalate or ask a clarifying question when hit.
- Tool-call cap: Cap total paid actions per run, with stricter sub‑caps for expensive tools like search and long-running automations.
- Token budget: Enforce a per‑run token ceiling across calls and summarize history instead of re-sending transcripts.
- Wall‑clock timeout: Keep interactive flows snappy and push long work into explicit background jobs with status updates.
- Tenant budgets and concurrency: Limit blast radius with per-tenant caps and FinOps anomaly alerts. CSPs like AWS have announced vastly improved Cost Anomaly Detection for inference services at re:Invent in December 2025.
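A sketch of what such a budget contract might look like in code, checked at the gateway before every model or tool call (class names and default limits are illustrative, not a real framework API):

```python
import time
from dataclasses import dataclass

@dataclass
class BudgetContract:
    max_steps: int = 8          # loop/step limit
    max_tool_calls: int = 5     # cap on paid actions per run
    max_tokens: int = 50_000    # per-run token ceiling
    max_seconds: float = 30.0   # wall-clock timeout

class BudgetExceeded(Exception):
    """Raised when a run breaches its budget contract."""

class RunGuard:
    """Checked at the gateway before every model or tool call."""
    def __init__(self, contract: BudgetContract):
        self.contract = contract
        self.steps = self.tool_calls = self.tokens = 0
        self.start = time.monotonic()

    def charge(self, steps=0, tool_calls=0, tokens=0):
        self.steps += steps
        self.tool_calls += tool_calls
        self.tokens += tokens
        c = self.contract
        if self.steps > c.max_steps:
            raise BudgetExceeded("loop limit hit: escalate or ask a clarifying question")
        if self.tool_calls > c.max_tool_calls:
            raise BudgetExceeded("tool-call cap hit: stop paid actions")
        if self.tokens > c.max_tokens:
            raise BudgetExceeded("token budget hit: summarize history instead of re-sending it")
        if time.monotonic() - self.start > c.max_seconds:
            raise BudgetExceeded("timeout: push remaining work to a background job")
```

The point of centralizing the checks in one guard is that every guardrail fires in the same place, so the agent’s failure behavior (escalate, summarize, defer) is explicit rather than emergent.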
How can interaction design and user experience drive FinOps savings?
Most FinOps savings come from architecture and interaction design, not from arguing over pennies per million tokens.
“Having comprehensive evals allows you to compare your product performance across LLMs and guide what LLMs you can use. The biggest cost saver is defaulting to the smallest possible model for data analysis while maintaining performance and accuracy, while still allowing customers to override and select the model of their choice,” says Geoffrey Hendrey, CEO of AlertD.
Three patterns consistently flatten the cost curve for us:
- Separate planning from execution. A planner can be context‑heavy and cheap, whereas an executor can be tool‑constrained and action‑oriented. This reduces “thinking while acting” loops and makes retries easier to reason about.
- Route work to the smallest capable model. Extraction, validation and routing succeed with smaller models when you use structured outputs. Reserve larger models for synthesis and edge cases that fail validation.
- Make tools idempotent and cacheable. Add idempotency keys to every write. Cache repeated reads inside a run. Tool-call caps become practical when retries stay safe.
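The second pattern — route to the smallest capable model and escalate only on failed validation — can be sketched like this (the validator, model tiers, and injected call function are all placeholders, not a real LLM API):

```python
def validate(output: dict) -> bool:
    # Stand-in for real structured-output/schema validation.
    return isinstance(output.get("category"), str) and output["category"] != ""

def route(prompt: str, call, models=("small", "medium", "large")) -> dict:
    """Try the smallest model first; reserve larger models for cases
    that fail validation."""
    for model in models:
        output = call(model, prompt)   # `call` wraps the actual model API
        if validate(output):
            return output
    raise RuntimeError("all models failed validation; flag for human review")
```

In practice the `call` argument would wrap a real model endpoint and the validator would check a full output schema, but the shape is the same: validation failures, not human judgment, decide when the expensive model gets used.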
Premium lane: Pricing that keeps your agent profitable
I expect many teams to keep seat-based pricing because procurement teams understand it. Predictable margin comes from attaching explicit entitlements to those seats and creating a controlled premium lane for expensive behavior.
- Seats plus allowances: Bundle a monthly budget of agent runs or action credits. Throttle or upsell when exceeded.
- Usage add‑ons: Sell metered AI as a separate SKU so power users fund their own tail behavior. Tread with caution here as you don’t want to add friction to adoption.
- Premium lane policy: Reserve premium models for high‑stakes tasks or failed validation paths, backed by a paid tier. Make sure deployments used for demos are on the paid tier.
How does FinOps mature from cost visibility to ROI?
As you mature, pricing shifts from bundled access to outcomes that map directly to customer value.
FinOps focus shifts in parallel from adoption-driven cost volatility to unit economics, acceptance integrity and forecastable margin.
| Maturity level | What you sell to customers | What FinOps cares about | What can go wrong |
| --- | --- | --- | --- |
| Seat-bundled | “Agents are included with the license.” | Gross margin volatility by adoption, cohort and workflow mix. | A few heavy workflows or tenants quietly dominate spend and there’s no clean lever to price, throttle or forecast it. |
| Credits-based | “You get X credits/month to spend on agent work and you can buy more as needed.” | Whether credit price covers costs, how many go unused, how often customers buy overages. | Credits fail as a budgeting tool if different workflows consume credits unpredictably and surprise customers. |
| Workflow metering | “You pay per workflow type (research, triage, enrichment, etc.).” | What each workflow costs per accepted outcome (CAPO), how often it succeeds and where the expensive outliers come from. | You ship a great meter and a weak value narrative, so procurement treats it as arbitrary fees and pushes for discounts. |
| Outcome-linked | “You pay when the outcome is accepted and delivered.” | Whether acceptance gates are trustworthy and each accepted outcome is delivered below its price. | Incentives shift to “passing the gate,” and borderline outcomes create disputes, churn risk and perverse product behavior. |
| Value-based contracts | “We guarantee a business result with predictable unit economics.” | Whether contracted outcomes can be delivered at the target margin, with reliable forecasts. | You sign outcome promises without enforcement and operational controls, then deliver more work than you can profitably price. |
A practical 30-60-90 day FinOps plan for agentic SaaS
- 0-30 days: Choose 3-5 high-volume workflows, define explicit acceptance gates and log every run with a unique ID tied to the tenant and workflow so you can trace cost and quality end-to-end.
- 31-60 days: Add routing and validation cascades, cache retrieval and tool outputs and harden tools with schemas, timeouts and idempotency keys.
- 61-90 days: Align pricing with entitlements, set anomaly alerts with an on‑call playbook and review CAPO and tail spend every month.
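The 0-30 day logging step can be as simple as tagging every run with a unique ID tied to tenant and workflow (the field names here are illustrative, and the `print` is a stand-in for a real telemetry sink):

```python
import json
import time
import uuid

def log_run(tenant: str, workflow: str, outcome: str, cost_usd: float) -> dict:
    """Emit one traceable record per agent run.
    Outcome states might include: accepted, rejected, abandoned, timeout, tool-error."""
    record = {
        "run_id": str(uuid.uuid4()),   # unique ID for end-to-end tracing
        "tenant": tenant,
        "workflow": workflow,
        "outcome": outcome,
        "cost_usd": cost_usd,
        "ts": time.time(),
    }
    print(json.dumps(record))          # stand-in for a real telemetry sink
    return record
```

Once every run carries these fields, CAPO, Failure Cost Share and per-tenant spend all become simple aggregations over the same log.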
This article is published as part of the Foundry Expert Contributor Network.
AI makes networking matter again 2 Mar 2026, 1:00 am
For years, one of the cloud’s biggest gifts was that vendors like AWS could take care of the “undifferentiated heavy lifting” of managing infrastructure for you. Need compute? Click a button. Need storage? Click another. Need a database? Let someone else worry about the details. The whole point of managed infrastructure was to save most enterprises from spending their days swimming in low-level systems engineering.
AI is making that abstraction leak.
As I’ve argued, the real enterprise AI challenge is no longer training. It’s inference: applying models continuously to governed enterprise data, under real-world latency, security, and cost constraints. That shift matters because once inference becomes the steady-state workload of the enterprise, infrastructure that once seemed necessary but dull suddenly becomes strategic.
That’s especially true of the network.
The network is… cool?
For decades, networking was prized precisely because it was stable and uneventful. That was the point: No one wants exciting networking. Standards bodies moved slowly and kernel releases moved carefully because predictability was paramount. That conservatism made sense in a world where most enterprise workloads were relatively forgiving and the network’s job was mostly to stay out of the way.
Interestingly, the times when networking became sexy(ish) were times of significant technology upheaval. Think 1999 to 2001, when we had the dot-com bubble/internet infrastructure boom. Then in 2007, we saw broadband and mobile expansion. Later we saw cloud networking consolidation from 2015 to 2022. We’re about to see another big upward shift in networking interest because of AI.
Although observers posting on X still obsess over training runs, model sizes, and huge capital expenditures for data center build-outs to support it all, the real action is arguably elsewhere. For most enterprises, training a model occasionally isn’t the hard part. The harder part is running inference all day, every day, across sensitive data, inside shared environments, with serious performance expectations. Network engineers might prefer to toil away in relative obscurity, but AI makes that impossible. In the AI era, network performance becomes a first-order bottleneck because the application is no longer just waiting on CPU or storage. It’s waiting on the movement of context, tokens, embeddings, model calls, and state across distributed systems.
In other words, AI doesn’t simply increase traffic volume; it changes the nature of what the network does.
A different view of the network
This isn’t the first time we’ve seen a network paradigm shift. As Thomas Graf, CTO at Cisco Security, cofounder of Isovalent, and the creator of Cilium, said in an interview, “The rise of Kubernetes and microservices was the first wave of east-west traffic acceleration. Instead of a single monolith, we broke applications up, and that immediately required security not just at the firewall but east-west inside the infrastructure.”
AI compounds that shift. These workloads aren’t just a few more services talking to one another. They involve synchronized GPU clusters, retrieval pipelines, vector lookups, inference gateways, and, increasingly, agents that continuously exchange state across systems. That’s a different operational world from the one most enterprise networks were built to support. “With AI workloads,” Graf continues, “that’s a hundred times more [data moving around]. Not because things are more broken up, but because AI runs at a scale that is bigger and needs an insane amount of data.”
That “insane amount of data” is why the network matters again and why developers need to think about it again.
In AI environments, the fabric increasingly becomes part of the compute system itself. GPUs exchange gradients, activations, and model state in real time. Packet loss isn’t just an annoyance, as it can stall collective operations and leave expensive hardware idle. Traditional north-south visibility isn’t enough because much of the important traffic never crosses a classic perimeter (e.g., user request to a server). Hence, security policy can’t live only at the edge because the valuable flows are often east-west inside the cluster. And because enterprises are still discovering what their AI demand curves will look like, elasticity matters, too. Networks have to scale incrementally, adapt to mixed workloads, and support evolving architectures without forcing a full redesign every time the AI road map changes.
In other words, AI is making the network less like plumbing and more like part of the application runtime.
Getting serious about Cilium
That’s why eBPF matters. The official eBPF project documentation describes eBPF as a way to safely run sandboxed programs in the kernel, extending kernel capabilities without changing kernel source or loading modules. The technical details are important, but the broader point is simple: eBPF moves observability and enforcement closer to where packets and system calls actually happen. In a world of east-west traffic, ephemeral services, and machine-speed inference, that’s a big deal.
Cilium is one important expression of that shift. It builds on eBPF to provide Kubernetes-native networking, observability, and policy enforcement as fast as the network link itself can carry traffic, without becoming a meaningful bottleneck. This is critical to network performance. Unsurprisingly, Cilium has become mainstream table stakes for hyperscalers’ networking stacks. (Google’s GKE Dataplane V2, Microsoft’s Azure CNI Powered by Cilium, and AWS’s EKS Hybrid Nodes all depend on or support Cilium.) Indeed, across the Kubernetes user base, as the 2025 State of Kubernetes Networking Report indicates, a majority use Cilium-based networking.
As important as Cilium is, however, the bigger story is that AI is forcing enterprises to care again about infrastructure details they had happily abstracted away. That doesn’t mean every company should hand-roll its network stack, but it does mean that platform teams can no longer treat networking as an untouchable utility layer. If inference is where enterprise AI becomes real, then latency, telemetry, segmentation, and internal traffic policy are no longer secondary concerns. They’re an essential part of product quality, operational reliability, and developer experience.
More than the network
Nor is this isolated to Cilium, specifically, or networking, generally. AI keeps forcing us to care about things we’d hoped to forget. As I’ve written, it’s fun to fixate on fancy AI demos, but the real work is to make these systems work reliably, securely, and economically in production. Just as important, in our rush to make AI dependable at enterprise scale, we can’t overlook the need to make the whole stack easier to use for developers, easier to govern by IT/ops, and faster under real-world load.
“If an AI-backed service responds faster and behaves more reactively, it will perform better in the market. And the foundation for that is a highly performant, low-latency network without bottlenecks,” notes Graf. “To me, this is very similar to high-frequency trading. Once computers replaced humans, network latency and throughput suddenly became a competitive differentiator.”
That feels right. The winners in enterprise AI won’t simply be the companies with the biggest models. Success comes from making inference reliable, governed, and economical on real data under real load. Some of that battle will be won in models. More of it than many enterprises realize will be won in the supposedly boring layers underneath, like networking.
OpenAI launches stateful AI on AWS, signaling a control plane power shift 27 Feb 2026, 5:37 pm
Stateless AI, in which a model offers one-off answers without context from previous sessions, can be helpful in the short term but falls short in more complex, multi-step scenarios. To overcome these limitations, OpenAI is introducing what it is calling, naturally, “stateful AI.”
The company has announced that it will soon offer a stateful runtime environment in partnership with Amazon, built to simplify the process of getting AI agents into production. It will run natively on Amazon Bedrock, be tailored for agentic workflows, and optimized for AWS infrastructure.
Interestingly, OpenAI also felt the need to make another announcement today, underscoring the fact that nothing about other collaborations “in any way” changes the terms of its partnership with Microsoft. Azure will remain the exclusive cloud provider of stateless OpenAI APIs.
“It’s a clever structural move,” said Wyatt Mayham of Northwest AI Consulting. “Everyone can claim a win, but the subtext is clear: OpenAI is becoming a multi-cloud company, and the era of exclusive AI partnerships is ending.”
What differentiates ‘stateful’
The stateful runtime environment on Amazon Bedrock was built to execute complex steps that factor in context, OpenAI said. Models can carry forward memory and history, tool and workflow state, environment use, and identity and permission boundaries.
This represents a new paradigm, according to analysts.
Notably, stateless API calls are a “blank slate,” Mayham explained. “The model doesn’t remember what it just did, what tools it called, or where it is in a multi-step workflow.”
While that’s fine for a chatbot answering one-off questions, it’s “completely inadequate” for real operational work, such as processing a customer claim that moves across five different systems, requires approvals, and takes hours or days to complete, he said.
New stateful capabilities give AI agents a persistent working memory so they can carry context across steps, maintain permissions, and interact with real enterprise tools without developers having to “duct-tape stateless API calls together,” said Mayham.
Further, the Bedrock foundation matters because it’s where many enterprise workloads already live, he noted. OpenAI and Amazon are meeting companies where they are, not asking them to rearchitect their security, governance, and compliance posture.
This makes sophisticated AI automation accessible to mid-market companies; they will no longer need a team of engineers to “build the plumbing from scratch,” he said.
Sanchit Vir Gogia, chief analyst at Greyhound Research, called stateful runtime environments “a control plane shift.” Stateless can be “elegant” for single interactions such as summarization, code assistance, drafting, or isolated tool invocation. But stateful environments give enterprises a “managed orchestration substrate,” he noted.
This supports real enterprise workflows involving chained tool calls, long running processes, human approvals, system identity propagation, retries, exception handling, and audit trails, said Gogia, while Bedrock enforces existing identity and access management (IAM) policies, virtual private cloud (VPC) boundaries, security tooling, logging standards, and compliance frameworks.
“Most pilot failures happen because context resets across calls, permissions are misaligned, tokens expire mid workflow, or an agent cannot resume safely after interruption,” he said. These issues can be avoided in stateful environments.
Factors IT decision-makers should consider
However, there are second-order considerations for enterprises, Gogia emphasized. Notably, state persistence increases the attack surface. This means persistent memory must be encrypted, governed, and auditable, and tool invocation boundaries should be “tightly controlled.” Further, workflow replay mechanisms must be deterministic, and observability granular enough to satisfy regulators.
There is also a “subtle lock-in dimension,” said Gogia. Portability can decrease when orchestration moves inside a hyperscaler-native runtime. CIOs need to consider whether their future agent architecture remains cloud-portable or becomes anchored in AWS’ environment.
Ultimately, this new offering represents a market pivot, he said: The intelligence layer is being commoditized.
“We are moving from a model race to a control plane race,” said Gogia. The strategic question now isn’t about which model is smartest. It is: “Which runtime stack guarantees continuity, auditability, and operational resilience at scale?”
Partnership with Microsoft still ‘strong and central’
Today’s joint announcement from Microsoft and OpenAI about their partnership echoes OpenAI’s similar reaffirmation of the collaboration in October 2025. The partnership remains “strong and central,” and the two companies went so far as to call it “one of the most consequential collaborations in technology,” focused on research, engineering, and product development.
The companies emphasized that:
- Microsoft maintains an exclusive license and access to intellectual property (IP) across OpenAI models and products.
- OpenAI’s Frontier and other first-party products will continue to be hosted on Azure.
- The contractual definition of artificial general intelligence (AGI) and the “process for determining if it has been achieved” is unchanged.
- An ongoing revenue share arrangement will stay the same; this agreement has always included revenue-sharing from partnerships between OpenAI and other cloud providers.
- OpenAI has the flexibility to commit to compute elsewhere, including through infrastructure initiatives like the Stargate project.
- Both companies can independently pursue new opportunities.
“That joint statement reads like it was drafted by three law firms simultaneously, and that’s the point,” says Mayham.
The anchor of the agreement is that Azure remains the exclusive cloud provider of stateless OpenAI APIs. This allows OpenAI to establish a new category on AWS that falls outside of Microsoft’s reach, he said.
OpenAI is ultimately “walking a tightrope,” because it should expand distribution beyond Azure to reach AWS customers, who comprise a massive portion of the enterprise market, he noted. At the same time, it has to ensure Microsoft doesn’t feel like its $135 billion investment “just got diluted in strategic value.”
Gogia called the statement “structural reassurance.” OpenAI must grow distribution across clouds because enterprise buyers are demanding multi-cloud flexibility: “They don’t want to be confined to a single cloud; they want architectural optionality.”
Also, he noted, “CIOs and boards do not want vendor instability. Hyperscaler conflict risk is now a board level concern.”
New infusion of funding (again)
Meanwhile, a new $110 billion in funding from Nvidia, SoftBank, and Amazon will allow OpenAI to expand its global reach and “deepen” its infrastructure, the company says. Importantly, the funding includes the use of 3 GW of dedicated inference capacity and 2 GW of training on Nvidia’s Vera Rubin systems. This builds on the Hopper and Blackwell systems already in operation across Microsoft, Oracle Cloud Infrastructure (OCI), and CoreWeave.
Mayham called this “the headline within the headline.”
“Cash doesn’t build AI products; compute does,” he said. Right now, access to next-generation Nvidia hardware is the “true bottleneck for every AI company on the planet.”
OpenAI is essentially locking in a “guaranteed supply line” for the chips that power everything it does. The money from all three companies funds operations and infrastructure, but the Nvidia capacity and training allows OpenAI to use infrastructure at the frontier, said Mayham. “If you can’t get the processors, the cash is just sitting in a bank account.”
Inference is now one of the biggest cost drivers in AI, and Gogia noted that frontier AI systems are constrained by physical infrastructure: GPUs, high-bandwidth memory (HBM), high-speed interconnects, and other hardware, as well as grid-level power capacity, are all finite resources.
The current moves embed OpenAI deeper into the infrastructure stack, but the risk is concentration. When compute control centralizes among a small cluster of hyperscalers and chip vendors, the system can become fragile. To protect themselves, Gogia advised enterprises to monitor supply chain concentration.
“In strategic terms, however, this move strengthens OpenAI’s durability,” he said. “It secures the physical substrate required to sustain frontier model scaling and enterprise inference growth.”
Red Hat ships AI platform for hybrid cloud deployments 27 Feb 2026, 2:10 pm
Red Hat has made its Red Hat AI Enterprise platform generally available, with the intent to simplify the development and deployment of hybrid cloud applications powered by AI.
Availability of the platform was announced February 24. Engineered to solve the “production gap” for AI, Red Hat AI Enterprise unifies AI model and application life cycles—from model development and tuning to high-performance inference—on a standard, centralized infrastructure to accelerate delivery, increase operational efficiency, and mitigate risk by providing a comprehensive, all-in-one experience, Red Hat said. Users can move away from treating AI as a disjointed, bespoke effort and transform it into a scalable, repeatable factory process. Red Hat AI Enterprise is powered by the Red Hat OpenShift cloud application platform.
Red Hat cited the following business benefits of Red Hat AI Enterprise:
- Accelerated time-to-value, with a ready-to-use environment for teams to “develop once and deploy anywhere” without rewriting code.
- Increased operational efficiency, simplifying workflows from code commits to model serving.
- Mitigated risk and governance, with a foundation for digital sovereignty, giving organizations control over where data and models reside.
For platform engineers, AI engineers, and application developers, Red Hat AI Enterprise provides a foundation for modern AI workloads, Red Hat said. This includes AI life-cycle management, high-performance inference at scale, agentic AI innovation, integrated observability and performance modeling, and trustworthy AI and continuous evaluation. Tools are provided for dynamic resource scaling, monitoring, and security. For zero-downtime maintenance, rolling platform updates keep the AI stack current and protected without disrupting active inference services, according to Red Hat.
‘Silent’ Google API key change exposed Gemini AI data 27 Feb 2026, 12:47 pm
Google Cloud API keys, normally used as simple billing identifiers for APIs such as Maps or YouTube, could be scraped from websites to give access to private Gemini AI project data, researchers from Truffle Security recently discovered.
According to a Common Crawl scan of websites carried out by the company in November, there were 2,863 live Google API keys that left organizations exposed. This included “major financial institutions, security companies, global recruiting firms, and, notably, Google itself,” Truffle Security said.
The alarming security weakness was caused by a silent change in the behavior of Google Cloud Platform (GCP) API keys, a change Google neglected to tell developers about.
For more than a decade, Google’s developer documentation has described these keys, identified by the prefix ‘AIza’, as a mechanism used to identify a project for billing purposes. Developers generated a key and then pasted it into their client-side HTML code in full public view.
However, with the appearance of the Gemini API (Generative Language API) from late 2023 onwards, it seems that these keys also started acting as authentication keys for sites embedding the Gemini AI Assistant.
No warning
Developers might build a site with basic features such as an embedded Maps function whose usage was identified for metering purposes using the original public GCP API key. When they later added Gemini to the same project, to, for example, make available a chatbot or other interactive feature, the same key effectively authenticated access to anything the owner had stored through the Gemini API, including datasets, documents and cached context. Because this is AI, extracting data would be as simple as prompting Gemini to reveal it.
That same access could also be exploited to consume tokens through the API, potentially generating large bills for the owners or exhausting their quotas, said Truffle Security. All an attacker would need to do is view a site’s source code and extract the key.
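The scraping step is trivial to reproduce. A minimal sketch, assuming the well-known ‘AIza’ prefix pattern that secret scanners commonly use for Google API keys (the sample key below is fabricated and non-functional):

```python
import re

# "AIza" followed by 35 URL-safe characters is the pattern secret scanners
# commonly use for Google API keys; treat it as a heuristic, not an official spec.
GOOGLE_API_KEY_RE = re.compile(r"AIza[0-9A-Za-z_\-]{35}")

def find_google_api_keys(page_source: str) -> list:
    """Return candidate Google API keys embedded in a page's source."""
    return GOOGLE_API_KEY_RE.findall(page_source)

# Hypothetical page source with a fabricated key:
sample = '<script>const cfg = { apiKey: "AIza' + 'x' * 35 + '" };</script>'
print(find_google_api_keys(sample))  # one 39-character candidate starting with "AIza"
```

Anything matching this pattern in public HTML is, per Truffle Security’s finding, potentially a live Gemini credential.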
“Your public Maps key is now a Gemini credential. Anyone who scrapes it can access your uploaded files, cached content, and rack up your AI bill,” the researchers pointed out. “Nobody told you.”
API key exploitation is more than hypothetical. In a different context, a student who reportedly exposed a GCP API key on GitHub last June was left nursing a $55,444 bill (later waived by Google) after it was extracted and re-used by others.
Truffle Security said it disclosed the key issue to Google in November, and Google eventually acknowledged it as a bona fide bug. After being told of the 2,863 exposed keys, Google restricted them from accessing the Gemini API.
On February 19, the 90-day bug disclosure window closed, with Google apparently still working on a more comprehensive fix.
“The initial triage was frustrating; the report was dismissed as ‘Intended Behavior.’ But after providing concrete evidence from Google’s own infrastructure, the GCP VDP team took the issue seriously,” said Truffle Security. “Building software at Google’s scale is extraordinarily difficult, and the Gemini API inherited a key management architecture built for a different era.”
Mitigation
The first job for concerned site admins is to check in the GCP console for keys specifically allowing the Generative Language API. In addition, look for unrestricted keys, now identified by a yellow warning icon. Check if any of these keys are public.
Exposed keys should all be rotated or ‘regenerated,’ with a grace period that considers the effect this will have on downstream apps that have cached the old one.
This vulnerability underlines how small cloud evolution oversights can have wider, unforeseen consequences. Truffle Security noted that Google now says in its roadmap that it is taking steps to remedy the API key problem: API keys created through AI Studio will default to Gemini-only access, and the company will also block leaked keys, notifying customers when they detect this to have happened.
“We’d love to see Google go further and retroactively audit existing impacted keys and notify project owners who may be unknowingly exposed, but honestly, that is a monumental task,” Truffle Security admitted.
This article originally appeared on CSOonline.
The reliability cost of default timeouts 27 Feb 2026, 2:00 am
In user-facing distributed systems, latency is often a stronger signal of failure than errors. When responses exceed user expectations, the distinction between “slow” and “down” becomes largely irrelevant, even if every service is technically healthy.
I’ve seen this pattern across multiple systems. One incident, in particular, forced me to confront how much production behavior is shaped by defaults we never explicitly choose. What stood out was not the slowness itself, but how “infinite by default” waiting quietly drained capacity long before anything crossed a traditional failure threshold.
Details are generalized to avoid sharing proprietary information.
When slowness turned into an outage
The incident started with support tickets, not alarms. Early in the morning, they began to appear:
- Product pages don’t load.
- Checkout is stuck.
- The site is slow today.
At the same time, our dashboards drifted in subtle ways. CPU climbed, memory pressure increased and thread pools filled while error rates stayed low. Product pages began hanging intermittently: some requests completed, others stalled long enough that users refreshed, opened new tabs and eventually left.
I was on call that week. There had been a recent deployment, so I rolled it back early. It had no effect, which told us the issue wasn’t a specific change, but how the system behaved under sustained slowness.
Within a few hours, the impact was measurable. Product page abandonment increased sharply. Conversion dropped by double digits. Support ticket volume spiked. Users started switching to competitors. By the end of the day, the incident resulted in a six-figure loss and, more importantly, a visible loss of user trust.
The harder question wasn’t what failed, but why user impact appeared before our pages fired. The system crossed the user’s pain threshold long before it crossed any paging threshold. Our alerts were optimized for hard failures – errors, instance health, explicit saturation – while latency lived on dashboards rather than in paging.
The failure mode we missed
Product pages displayed prices in the user’s local currency. To do that, the Product Service called a downstream currency exchange API. That dependency did not go down. It became slow, intermittently, for long enough to trigger a cascade.
As I dug deeper during the incident, one detail stood out. The Product Service used an HTTP client with default configuration, where the request timeout was effectively infinite. On the frontend, browsers stopped waiting after roughly 30 seconds. On the backend, requests continued to wait long after the user had already given up.

Violetta Pidvolotska
That gap mattered more than I expected. The first few hung currency calls held onto Product Service worker threads and outbound connections, so new requests began queuing behind work that no longer had a user on the other end. Once the shared pools started to saturate, it stopped being “only the currency path.” Even requests that didn’t require currency conversion slowed down because they waited for the same thread pool and the same internal capacity.
At that point, the dependency didn’t need to fail to take the service down. It only needed to become slow while we kept waiting without a boundary. This wasn’t an error failure. It was a capacity failure. Blocked concurrency accumulated faster than it could drain, latency propagated outward and throughput collapsed without a single exception being thrown.
Some mitigations helped only temporarily. Restarting instances or shedding traffic reduced pressure for a short time, but the relief never lasted. As long as requests were allowed to wait indefinitely, the system kept accumulating work faster than it could complete it.
When we finally pinpointed the unbounded wait, the immediate fix sounded simple: set a timeout. The real lesson was deeper.
Defaults that quietly shape system behavior
At first glance, this looked like a simple misconfiguration. In reality, it reflected how common default settings influence system behavior in production.
Many widely used libraries and systems default to infinite or extremely large timeouts. In Java, common HTTP clients treat a timeout of zero as “wait indefinitely” unless explicitly configured. In Python, requests will wait indefinitely unless a timeout is set explicitly. The Fetch API does not define a built-in timeout at all.
These defaults aren’t careless. They’re intentionally generic. Libraries optimize for the correctness of a single request because they can’t know what “too slow” means for your system. Survivability under partial failure is left to the application.
Production systems rarely fail under ideal conditions. They fail under load, partial outages, retries and real user behavior. In those conditions, unbounded waiting becomes dangerous. Defaults that feel harmless during development quietly make architectural decisions in production.
When we later audited our services as a team, we found that many calls either had no timeouts or had values that no longer matched real production latency. The defaults had been shaping system behavior for years, without us explicitly choosing them.
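The effect of an unbounded default is easy to reproduce locally. A minimal sketch using only the Python standard library, with a deliberately slow local HTTP server standing in for the dependency (the server behavior and timeout values are illustrative, not the production setup):

```python
import http.server
import socket
import threading
import time
import urllib.error
import urllib.request

class SlowHandler(http.server.BaseHTTPRequestHandler):
    """Simulates a dependency that is up but slow: it answers after 2 seconds."""
    def do_GET(self):
        time.sleep(2)
        self.send_response(200)
        self.end_headers()
        self.wfile.write(b"ok")
    def log_message(self, *args):
        pass  # keep request logging quiet

class QuietServer(http.server.ThreadingHTTPServer):
    daemon_threads = True
    def handle_error(self, request, client_address):
        pass  # ignore broken pipes after the client stops waiting

server = QuietServer(("127.0.0.1", 0), SlowHandler)
port = server.server_address[1]
threading.Thread(target=server.serve_forever, daemon=True).start()

# Without `timeout=`, the client would block for the full 2 seconds, or forever
# if the server never answered. An explicit timeout bounds the wait.
timed_out = False
try:
    urllib.request.urlopen(f"http://127.0.0.1:{port}/", timeout=0.5)
except (urllib.error.URLError, socket.timeout):
    timed_out = True

server.shutdown()
print(timed_out)  # True: the caller stopped waiting after 0.5 s
```

The same code without `timeout=` is exactly the "infinite by default" behavior described above: healthy-looking, no errors, capacity quietly consumed.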
The mental model behind long timeouts
What this incident revealed wasn’t just a missing timeout. It exposed a mental model many teams rely on, including ours at the time.
That model assumes:
- Dependencies are usually fast
- Slowness is rare
- Defaults are reasonable
- Waiting longer increases the chance of success
It prioritizes individual request success, often at the cost of overall system reliability. As a result, teams often don’t know their effective timeouts, different services use inconsistent values and some calls have no timeouts at all.

Even when timeouts exist, they are often far longer than what user behavior justifies. In our case, users retried within a few seconds and abandoned within about ten. Waiting beyond that didn’t improve outcomes. It only consumed capacity.
Long timeouts can also mask deeper design problems. If a request regularly times out because it returns thousands of items, the issue isn’t the timeout itself. It’s missing pagination or poor request shaping. By optimizing for individual request success, teams unintentionally trade away system-level resilience.
Timeouts as failure boundaries
Before this incident, we mostly treated timeouts as configuration knobs. After that, we started treating them as failure boundaries.
A timeout defines where a failure is allowed to stop. Without timeouts, a single slow dependency can quietly consume threads, connections and memory across the system. With well-chosen timeouts, slowness stays contained instead of spreading into a system-wide failure.
We made a set of deliberate changes:
1. Enforced timeouts on the client side
The caller decides when to stop waiting. Load balancers, proxies or servers could not reliably protect us from hanging forever, as the incident made clear.
2. Introduced explicit end-to-end deadlines for user-facing flows
Downstream calls could only use the remaining time budget; waiting beyond that point was wasted work with no chance of improving the outcome.

We made those deadlines explicit and portable. In HTTP flows, we propagated an end-to-end deadline via a single X-Request-Deadline header so each service could compute the remaining time and set per-call timeouts accordingly. We chose a deadline (not a per-hop timeout) because it composes cleanly across service boundaries and retries.
For gRPC paths, built-in deadlines allowed remaining time to propagate across service boundaries. We extended that same boundary through internal request context so background work stopped when the budget did.
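As a sketch of the deadline-propagation idea, assuming the X-Request-Deadline header carries an absolute epoch timestamp (the article names the header; the encoding and helper functions here are my assumptions):

```python
import time

# Assumed encoding: the header carries the absolute deadline as epoch seconds.
DEADLINE_HEADER = "X-Request-Deadline"

def remaining_budget(headers, now=None):
    """Seconds left until the end-to-end deadline; <= 0 means stop immediately."""
    now = time.time() if now is None else now
    return float(headers[DEADLINE_HEADER]) - now

def per_call_timeout(headers, cap, now=None):
    """Timeout for one downstream call: the remaining budget, capped per dependency."""
    return max(0.0, min(remaining_budget(headers, now), cap))

# A handler would derive each per-call timeout from the propagated header:
headers = {DEADLINE_HEADER: str(time.time() + 1.5)}  # 1.5 s total budget
print(per_call_timeout(headers, cap=0.4))  # capped at the per-dependency limit
```

Because the deadline is absolute, it composes across hops: each service simply subtracts the current time instead of guessing how much budget upstream callers have already spent.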
3. Became deliberate about how timeout values were chosen
Connection timeouts were kept short and tied to network behavior. Request timeouts were based on real production latency, not intuition.
Rather than relying on averages, we focused on p99 and p99.9. When p50 was close to p99, we left room so minor slowdowns didn’t amplify into timeout spikes. This helped us understand how slow requests behaved under load and choose timeouts that protected capacity without causing unnecessary failures.
For example, if 99% of requests completed in 300 milliseconds, a timeout of 350-400 milliseconds provided a better balance than tens of seconds. What happened beyond that point became a conscious product decision. In our case, when currency conversion timed out, we fell back to showing prices in the primary currency. Users consistently preferred an imperfect answer over waiting indefinitely.
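A sketch of that fallback decision, with a stand-in conversion function and illustrative numbers (the names, currencies, and budget are hypothetical, not the production code):

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout

PRIMARY_CURRENCY = "USD"  # hypothetical primary currency

def convert_slowly(amount, target_currency):
    """Stand-in for the currency API: correct, but far too slow today."""
    time.sleep(1.0)
    return amount * 0.9

def display_price(amount, target_currency, budget_s=0.3):
    """Return (value, currency): converted if it fits the budget, else the primary."""
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(convert_slowly, amount, target_currency)
        try:
            return future.result(timeout=budget_s), target_currency
        except FutureTimeout:
            return amount, PRIMARY_CURRENCY  # imperfect answer beats an endless wait

print(display_price(100, "EUR"))  # falls back: (100, 'USD')
```

The key point is that what happens at the timeout boundary is a product decision, made explicit in code, rather than an accident of library defaults.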
We also kept retries conservative in user-facing paths. A retry that doesn’t respect an end-to-end deadline is worse than no retry: it multiplies work after the user has already moved on. That’s how “helpful” retries turn into retry storms under partial slowness.
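A deadline-respecting retry along those lines might look like this sketch (function names, the attempt cap, and budgets are illustrative):

```python
import time

def call_with_deadline_retries(fn, budget_s, per_try_timeout_s, max_attempts=2):
    """Retry only while the end-to-end budget has time left; never beyond it."""
    deadline = time.monotonic() + budget_s
    last_exc = None
    for _ in range(max_attempts):
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break  # the user has already moved on; more work is wasted work
        try:
            return fn(timeout=min(per_try_timeout_s, remaining))
        except TimeoutError as exc:
            last_exc = exc
    raise TimeoutError("end-to-end deadline exhausted") from last_exc

# Demo: the first attempt "times out," the second succeeds within budget.
attempts = []
def flaky_call(timeout):
    attempts.append(timeout)
    if len(attempts) == 1:
        raise TimeoutError("simulated slow dependency")
    return "ok"

result = call_with_deadline_retries(flaky_call, budget_s=1.0, per_try_timeout_s=0.4)
print(result, len(attempts))  # ok 2
```

Capping attempts and clipping each try to the remaining budget is what keeps a retry from amplifying load during exactly the slowness it is trying to survive.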
As a team, we codified these decisions into shared client defaults and a mandatory review checklist used across new and existing call paths so unbounded waiting didn’t quietly return.
Keeping timeouts honest
Timeouts should never be silent. After the incident, we focused on three things:
1. Making timeouts observable
Every timeout emitted a structured log entry with dependency context and remaining time budget. We tracked timeout rates as metrics and alerted on sustained increases rather than individual spikes. Rising timeout rates became an early warning signal instead of a surprise during incidents. Importantly, we updated paging to include user-impacting latency and “requests not finishing” signals, not just error rate.
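A minimal sketch of such a structured timeout record (the field names and logger setup are assumptions, not the team's actual schema):

```python
import json
import logging
import time

logger = logging.getLogger("timeouts")

def timeout_event(dependency, elapsed_s, budget_left_s):
    """One structured record per timeout, so rates can be graphed and alerted on."""
    return {
        "event": "dependency_timeout",
        "dependency": dependency,
        "elapsed_s": round(elapsed_s, 3),
        "budget_left_s": round(budget_left_s, 3),
        "ts": time.time(),
    }

def record_timeout(dependency, elapsed_s, budget_left_s):
    # Emit as one JSON line so log pipelines can aggregate timeout rates per dependency.
    logger.warning(json.dumps(timeout_event(dependency, elapsed_s, budget_left_s)))

record_timeout("currency-api", elapsed_s=0.412, budget_left_s=0.05)
```

Aggregating these events per dependency is what turns a rising timeout rate into an early-warning signal instead of a surprise during an incident.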
2. Stopping the treatment of timeout values as constants
Traffic grows, dependencies change and architectures evolve, so values that were reasonable a year ago are often wrong today. We reviewed timeout configuration whenever traffic patterns shifted, new dependencies were introduced or latency distributions changed.
3. Validating timeout behavior before real incidents forced the issue
Introducing artificial latency in non-production environments quickly exposed hanging calls, retry amplification and missing fallbacks. It also forced us to separate two different questions: what breaks under load and what breaks under slowness.
Traditional load tests answered the first. Fault-injection and latency experiments revealed the second, a form of controlled failure often described as chaos engineering. By introducing controlled delay and occasional hangs, we verified that deadlines actually stopped work, queues didn’t grow without bound and fallbacks behaved as intended.
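A toy latency-injection wrapper along these lines (the probabilities and delays are illustrative; this is a sketch for non-production experiments, not a fault-injection framework):

```python
import random
import time

def with_injected_latency(fn, p_slow=0.1, delay_s=2.0, rng=random):
    """Wrap a dependency call so it occasionally stalls, to verify that
    deadlines fire, queues stay bounded, and fallbacks actually engage."""
    def wrapped(*args, **kwargs):
        if rng.random() < p_slow:
            time.sleep(delay_s)  # artificial slowness, not an error
        return fn(*args, **kwargs)
    return wrapped

fast = with_injected_latency(lambda: "ok", p_slow=0.0)
slow = with_injected_latency(lambda: "ok", p_slow=1.0, delay_s=0.2)

print(fast(), slow())  # same answers; the second call takes at least 0.2 s
```

Note that the injected failure mode is delay, not an exception: that distinction is exactly the "breaks under load" versus "breaks under slowness" question the paragraph above separates.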
Lessons that carried forward
This incident permanently changed how I think about timeouts.
A timeout is a decision about value. Past a certain point, waiting longer does not improve user experience. It increases the amount of wasted work a system performs after the user has already left.
A timeout is also a decision about containment. Without bounded waits, partial failures turn into system-wide failures through resource exhaustion: blocked threads, saturated pools, growing queues and cascading latency.
If there is one takeaway from this story, it is this: define timeouts deliberately and tie them to budgets. Start from user behavior. Measure latency at p99, not just averages. Make timeouts observable and decide explicitly what happens when they fire. Isolate capacity so that a single slow dependency cannot drain the system.
Unbounded waiting is not neutral. It has a real reliability cost. If you do not bound waiting deliberately, it will eventually bound your system for you.
This article is published as part of the Foundry Expert Contributor Network.
Cloud sovereignty isn’t a toggle feature 27 Feb 2026, 1:00 am
Sovereignty, locality, and “alternative cloud” strategies are often treated as simple settings in hyperscaler consoles. Pick a region, check a compliance box, and move on. IT consultancy Coinerella posted about replacing a typical US-centric startup baseline with a “Made in the EU” stack. They treat sovereignty as an architectural posture and an operating model that can save money. It still involves friction, compromise, and more responsibility than outsourcing to default ecosystems.
The Coinerella approach is to deliberately refuse to let the platform drift toward AWS and US-based hyperscalers, driven by practical considerations such as data residency, General Data Protection Regulation (GDPR) compliance, reducing concentration risk, and demonstrating the operational viability of European infrastructure. Leaders often talk about sovereignty until the first production incident, the first compliance review, or the first integration gap. Coinerella remains committed and is addressing the consequences.
A ‘made in the EU’ stack
Coinerella didn’t pursue sovereignty by inventing new patterns. They recreated a fairly standard modern platform using European providers and selectively self-hosted services. For core infrastructure, they moved primary compute and foundational services to Hetzner, including virtual machines, load balancing, and S3-compatible object storage. This is where the story gets interesting: The hyperscaler narrative suggests that leaving AWS is mostly about giving up features. Coinerella found something different, at least for the basics. Compared with what many teams experience on AWS, their new performance and capability were solid, and the cost profile was compelling.
When Hetzner didn’t provide a managed service they needed, they filled in the gaps with Scaleway. That included transactional email, a container registry, additional object storage, observability tools, and even domain registration. In many migrations, stitching together multiple providers is where complexity balloons; here, the company intentionally used that approach, choosing the best option available in the region rather than forcing a single vendor to do everything.
At the edge, they relied on Bunny.net for the content delivery network and related capabilities, including storage, DNS, image optimization, web application firewall, and DDoS protection. That choice is a reminder that edge services are not just an add-on; they are a major part of the platform’s reliability and security posture. Their blog suggests the experience felt approachable, coming from the more common Cloudflare-centric world, which is exactly what you want when you’re reducing risk in a migration.
Coinerella also addressed AI inference in a sovereignty-aware way by using European GPU capacity via Nebius rather than defaulting to US regions for inference calls. For identity, they used Hanko, a European authentication provider that supports modern authentication approaches like passkeys and handles common log-in expectations such as social log-ins.
Finally, and importantly, they self-hosted a meaningful set of internal services on Kubernetes, using Rancher as the management layer. They ran Gitea for source control, Plausible for analytics, Twenty for CRM, Infisical for secrets management, and Bugsink for error tracking. If you’ve ever advised an enterprise to self-host “just a few things,” you know what this really means: You’re accepting a different operational contract, where savings and control come with life-cycle ownership.
Surprises and extra hurdles
Coinerella’s post is most valuable where they write about difficulties in the “boring” services that often make or break developer productivity. Email was one of the major friction points. In the US ecosystem, transactional email options are plentiful, polished, and easy to integrate, with a deep bench of community guidance for deliverability and troubleshooting. Coinerella made it work with a European alternative, but the takeaway is clear: The long tail of integrations, templates, and community answers isn’t evenly distributed across regions. It’s not that the service can’t function; it’s that you may have to serve as your own integration team more often.
Source control was another challenge. Moving away from GitHub isn’t just about moving away from a Git remote; it’s about leaving an ecosystem: CI/CD defaults, actions, marketplace integrations, and the operational muscle memory of every developer who has internalized the GitHub way of doing things. Gitea can be a solid foundation, but it doesn’t automatically bring the full assembly line you get “for free” on the dominant platform.
There were also cost anomalies. The author notes that some top-level domains appeared to be significantly more expensive through European registrars—sometimes dramatically so—without a satisfactory explanation why. That’s not an architectural deal-breaker, but it’s exactly the kind of real-world detail that proves a point: These journeys aren’t clean-room exercises. You’ll encounter unexpected differences in market structure, and you’ll have to decide how much they matter.
Unavoidable dependencies
If you’re looking for a purity narrative that claims “we removed every US dependency,” this isn’t it. Coinerella acknowledged that some dependencies are structural. User acquisition may require Google’s advertising ecosystem, and mobile distribution routes may have to go through Apple’s developer program. Social log-ins often rely on Google and Apple infrastructure, and removing them can harm conversion rates. Even AI introduces pressure: If you want access to specific frontier models, you may be forced to use US-based APIs.
The smarter posture this blog implicitly recommends is to minimize what you can, isolate what you can’t, and be honest about the trade-offs. Sovereignty isn’t binary. It’s a spectrum of choices about where your core data and operational dependencies reside.
Moving to an alt cloud
Coinerella’s experience mirrors what many enterprises are learning as they move toward alt clouds, including sovereign clouds, private clouds, and other non-default platforms. The biggest lesson is that the economics of the move can be attractive precisely because you’re taking on more work. Lower infrastructure costs are real, but they come with increased integration responsibility, more platform engineering, and a higher need for operational maturity.
This is also where the “want versus need” conversation becomes unavoidable. Hyperscalers have trained teams to select managed services the way you pick items off a menu, often because it’s convenient, fast, and politically easy. Alt cloud strategies force prioritization. You may want the newest managed feature set, the deepest marketplace, and the broadest ecosystem, but you may not need them to meet your business outcomes. When you choose sovereignty or a private-cloud footing, you often end up selecting simpler technologies that meet requirements, even if they’re less glamorous or less feature-rich. This is not a retreat. It’s a form of architectural discipline.
However, none of this works without adding new practices. Finops becomes an engineering discipline that spans heterogeneous providers, self-hosted platforms, and capacity planning decisions you can no longer punt to a hyperscaler. Observability becomes a first-class design requirement because you’re building a platform that crosses boundaries and includes components you own end to end. You need consistent metrics, logs, traces, service-level objectives, and incident response procedures that work even when tools and APIs differ across providers. Because you’re doing more of the work, you need to be more explicit about patching, security, backups, recovery testing, and operational runbooks.
The point isn’t that this is too hard to do. The point is that it’s hard in predictable ways. Coinerella’s blog makes the case that the journey is worth the trouble, but it’s not easy—and that’s the framing enterprise leaders need. If you expect sovereignty to be a product feature, you’ll be disappointed. If you treat it as a strategic posture that comes with real engineering commitments, you can get the control, cost profile, and locality benefits you’re looking for without being surprised by the work required.
Google’s Android developer verification program draws pushback 26 Feb 2026, 5:06 pm
Google’s planned Android developer verification program, requiring Android apps to be registered by verified developers, is getting pushback, with opponents urging developers not to sign up for the program and to make their opposition known.
An open letter opposing the verification program was posted February 24 at Keep Android Open, a consortium that is fighting the Google verification program. Among the 41 signatories as of February 26 are the Electronic Frontier Foundation, the Free Software Foundation, the Center for Digital Progress, and the Software Freedom Conservancy. “Android, currently an open platform where anyone can develop and distribute applications freely, is to become a locked-down platform, requiring that developers everywhere register centrally with Google in order to be able to distribute their software,” said Marc Prud’hommeaux of the F-Droid Android development community in a blog post.
Google could not be reached for comment on February 26. The program was announced August 25, 2025. Starting in September, Android will require all apps to be registered by verified developers before they can be installed on certified Android devices. “To better protect users from repeat bad actors spreading malware and scams, we’re adding another layer of security to make installing apps safer for everyone: developer verification,” said Suzanne Frey, Google’s VP, Product, Trust and Growth for Android, in the blog post announcing the program. “This creates crucial accountability, making it much harder for malicious actors to quickly distribute another harmful app after we take the first one down,” she said.
Keep Android Open disagrees. In the open letter, the organization calls upon Google to:
- Immediately rescind the mandatory developer registration requirement for third-party distribution.
- Engage in transparent dialogue with civil society, developers, and regulators about Android security improvements that respect openness and competition.
- Commit to platform neutrality by ensuring that Android remains a genuinely open platform where Google’s role as platform provider does not conflict with its commercial interests.
Keep Android Open wants developers to resist by refusing to sign up for early access, refusing to perform early verification, and refusing to accept an invitation to the Android Developer Console. Instead, the group advises, developers should respond to the invitation with a list of concerns and objections. It encourages consumers to contact national regulators and express concerns.
Lightrun unveils AI SRE to find and fix software production errors 26 Feb 2026, 12:59 pm
Lightrun has announced Lightrun AI SRE, an AI-powered site reliability engineering (SRE) assistant designed to detect software production errors and performance degradations.
Introduced February 25, the Lightrun AI SRE correlates the service-level issues it finds with proven root causes to propose solutions. Drawing on live, in-line runtime context, the AI SRE allows AI agents and engineering teams to create missing evidence dynamically, prove root causes with live execution data, and validate fixes directly in live environments, Lightrun said.
The company cited the following key capabilities and benefits of AI SRE:
- Performs root cause analysis based on new evidence from live environments, without needing prior instrumentation.
- Suggests runtime-validated code changes to eliminate guesswork and reduce rollback-and-redeploy cycles.
- Performs live issue debugging in safe remote sessions with execution-level behavior inspections.
- Provides dynamic telemetry to running systems to fill visibility gaps that traditional observability tools cannot address.
- Reduces reliance on expensive war rooms, thanks to autonomous remediation and the ability to receive a code fix for an incident before it escalates to a human.
- Provides resilience to “unknown unknowns” introduced by multiple AI agents across the SDLC.
The Lightrun AI SRE safely interacts with live systems via Lightrun’s Sandbox to create new evidence, test hypotheses, and validate outcomes against real execution behavior, Lightrun said. This capability transforms AI SRE from a reactive, post-incident advisor into a trusted, runtime-verified autonomous engineer that ensures reliability by design, according to the company.
The browser is your database: Local-first comes of age 26 Feb 2026, 1:00 am
Once upon a time, we had mainframes with simple, nonprogrammable consoles. All the power was centralized. Then, Gates and Jobs put a personal computer on every desk. The power was distributed. Then, the internet came along, and the browser became the most popular application in the world. The power moved back onto the server, and the cloud became king. Now, that pendulum is swinging back.
This article is a first look at the local-first movement, and the new technologies embedding feature-rich data storage options directly into web browsers.
PGLite: The database in your browser
The modern browser is a beast, the result of years of intensive development and real-world testing. Today’s browser typically runs on a very capable machine. Yet, like a pauper, it must ask the server every time it wants some data. Browser state is just a temporary shadow, eradicated every time the screen refreshes. The loading spinner, the UI waterfall, the click-and-wait; these are all effects of our ongoing dependency on the back end for persistent state.
But an alternative is emerging. The idea is to embed a relational database directly in the browser, with a slice of the data, and let a synchronization (sync) engine keep everything consistent. The browser interacts with a local datastore that is synced to the server in the background. This means instant interactivity on the front end while maintaining symmetry with the back end. This next-generation browser has a more resilient state-of-record, not just a temporary cache.
Several factors have emerged to make the browser a more robust datastore, including IndexedDB and WebAssembly, paving the way for tools like the in-browser NoSQL datastore, PouchDB. But probably the star of the show these days is the PGLite SQL database.
Of course, everything comes with tradeoffs. There are architectural implications to moving the database into the browser. But the most significant change is the gradual distancing from two of the bedrocks of web development: JSON and REST.
The isomorphic future
I recently wrote about how WinterTC moves us toward the dream of isomorphic JavaScript, where the server and client are exactly the same. The next stage is achieving similar homogeneity across datastores. That has only recently become possible with the maturation of WASM, which can run a full-featured PostgreSQL instance in the browser. That instance, the WASM database, is PGLite.
Although SQLite can get you close to an enterprise database in the browser, PGLite is literally the same database you would run in the data center. It eliminates the friction of dialect. The WASM runtime (really a wonder of the modern programming world) makes PGLite a lightweight build of the actual Postgres codebase.
All of this means we are nearer than ever to a thick client. Of course, there are nuances to be considered, and there will be twists and turns along the path.
Shape-based syncing
Even if the API and implementation are the same, we can’t just create a shard of the entire database in the browser. It’s too big, and it would be insecure anyway. We only want the data the specific user needs for a given session.
An influential idea is “shape-based” syncing. This was popularized by ElectricSQL, which is also the force behind PGLite. A shape is something like a view. It takes one or more queries and uses them to populate the client-side database with a segment of the relevant data. Only the server holds the full truth. The client subscribes to a specific shape within the server (e.g., SELECT * FROM issues WHERE assigned_to = 'me').
Under the hood, syncing relies on Postgres’s native Logical Replication protocol. The sync engine is a middleware consumer. It listens to the database’s write-ahead log (the real-time stream of all changes happening on the server). When a change occurs that matches a client’s subscribed shape, the engine pushes that specific update down the background WebSocket to the browser’s PGLite instance. This activity is bidirectional. Local writes are applied instantly to the UI, then queued and streamed back upstream to the central database, while the engine handles the necessary conflict-resolution logic.
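The filtering step can be sketched in a few lines. This is a conceptual illustration in Python, not the actual ElectricSQL implementation: the `Change` and `Shape` types and all field names are hypothetical stand-ins for a replication-stream consumer that matches each change against a client's subscribed shape.

```python
# Hypothetical sketch: a sync engine consumes a stream of row-level changes
# (as produced by Postgres logical replication) and pushes down only the
# changes that fall inside a client's subscribed shape.
from dataclasses import dataclass
from typing import Callable

@dataclass
class Change:
    table: str
    row: dict

@dataclass
class Shape:
    table: str
    predicate: Callable[[dict], bool]  # the WHERE clause, as a function

def relevant(change: Change, shape: Shape) -> bool:
    """A change matters to a client only if it matches the client's shape."""
    return change.table == shape.table and shape.predicate(change.row)

# Client subscribes to: SELECT * FROM issues WHERE assigned_to = 'me'
shape = Shape("issues", lambda row: row["assigned_to"] == "me")

wal_stream = [
    Change("issues", {"id": 1, "assigned_to": "me"}),
    Change("issues", {"id": 2, "assigned_to": "alice"}),
    Change("users",  {"id": 9, "name": "bob"}),
]

# Only the first change would be pushed down to this client's local database.
pushed = [c for c in wal_stream if relevant(c, shape)]
```

The real engine also handles ordering, acknowledgments, and the upstream write path, but the core idea is exactly this predicate match against the replication stream.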
In the old days of progressive web apps (PWAs), you had to write imperative code to replay failed requests when the user came back online. The technique worked but it was brittle, and not a great developer experience. Modern sync engines offer a more elegant solution by doing the work themselves.
At this point, you might be thinking: “But the browser is ephemeral! Users clear their cache!”
The sync engine addresses this natural objection. To understand how, think about the architecture of Git:
- The remote database (GitHub) is the source of truth.
- The local database (your laptop) contains the working data.
If a user clears their browser cache, they haven’t lost their data. They’ve simply deleted their local repository. When they log in again, the sync engine essentially performs a git clone, pulling their “shape of data” back down to the device.
Conflict-free replicated data types
But what happens if two users edit the exact same data while offline? In a standard database, the last write would overwrite the previous one. The syncing logic needs to be very sophisticated indeed, to handle a multitude of clients operating on shapes and continually dumping their syncs to the central datastore.
This is where CRDTs (conflict-free replicated data types) come in.
CRDTs are an esoteric-sounding set of mathematical constructs with practical applications for the syncing problem. These data structures (like a Map or a List) are designed to be merged mathematically. It’s the difference between a Git merge conflict (which stops work and requires human intervention) and Google Docs (which merges everyone’s typing automatically). By using CRDT logic, sync engines ensure users’ offline edits are never lost; instead, they are seamlessly combined when the connection is restored.
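The simplest CRDT makes the "merged mathematically" idea concrete. Here is a minimal sketch of a grow-only counter (a G-Counter), one of the textbook CRDTs; production sync engines use richer structures for maps and lists, but the merge-is-commutative property is the same:

```python
# G-Counter: each replica increments only its own slot, and merging takes
# the element-wise maximum. Concurrent offline increments are never lost,
# and merges converge to the same result in any order.
def increment(state: dict, replica_id: str) -> dict:
    new_state = dict(state)
    new_state[replica_id] = new_state.get(replica_id, 0) + 1
    return new_state

def merge(a: dict, b: dict) -> dict:
    # Element-wise max: a commutative, associative, idempotent merge.
    return {k: max(a.get(k, 0), b.get(k, 0)) for k in a.keys() | b.keys()}

def value(state: dict) -> int:
    return sum(state.values())

# Two users increment independently while offline...
alice = increment({}, "alice")                      # alice: +1
bob = increment(increment({}, "bob"), "bob")        # bob: +2

# ...and the merge converges regardless of who syncs first.
assert merge(alice, bob) == merge(bob, alice)
assert value(merge(alice, bob)) == 3
```

Because the merge function is commutative and associative, no replica ever needs to ask a human which edit "wins"; the math guarantees convergence.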
Now, let me preempt what you might be thinking next: This is a lot of additional architecture. I mean, now we have two databases, a syncing engine, and that “shape” appears to be a duplication of a SELECT statement that should live on the server.
We have in a sense taken distributed computing, a classic hard problem, and split it at the datastore. We are doing this to avoid the clunkiness of loading data, but at the expense of a known pattern: The JSON API (and REST).
If we are willing to do all this, we must be hoping to gain something of great value from it, right?
Outgrowing the JSON API
This new approach makes something possible that web developers have been chasing for 20 years: the desktop-class experience.
By interacting with local data, the UI achieves a responsiveness that is simply out of reach when relying on direct network calls. Pulling in a full-blown PostgreSQL instance is a thoroughgoing solution that avoids half-measures like local caches. On top of that, it produces another interesting potential win in developer experience.
By eliminating the back-end API, we toss out a whole layer of coupling that web developers have traditionally had to deal with. The goal is to somehow avoid manually translating client-side data into a transport format, then back into a datastore format, and then back again. (Frameworks like HTMX are also working toward this ideal, although they approach it differently.)
In the ideal of local-first data, we don’t have to do any of that marshaling of JSON. We just write the SQL statement that we want, and the sync engine automatically handles the transport (based on the rules we’ve defined). We no longer write a GET /todos endpoint. Instead, we write an SQL query in our component: SELECT * FROM todos.
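The pattern is easy to see with any embedded database. This sketch uses Python's built-in sqlite3 as a stand-in for the client-side datastore (PGLite plays this role in the browser); the table and rows are invented for illustration:

```python
# Local-first pattern: the component queries local, synced state directly
# instead of calling GET /todos over the network and unmarshaling JSON.
import sqlite3

db = sqlite3.connect(":memory:")  # stand-in for the local, synced datastore
db.execute("CREATE TABLE todos (id INTEGER, title TEXT, done INTEGER)")
db.executemany(
    "INSERT INTO todos VALUES (?, ?, ?)",
    [(1, "write article", 1), (2, "review edits", 0)],
)

# No endpoint, no transport format: just the query the UI actually wants.
open_todos = db.execute(
    "SELECT title FROM todos WHERE done = 0"
).fetchall()
```

The sync engine, not the application code, is responsible for keeping that local table consistent with the server.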
IndexedDB and OPFS
While PGLite is fascinating technology, it isn’t the only story on the local-first data front. It’s actually part of a larger constellation of technologies. After all, developers have always found places to stash data: namely, localStorage, cookies, and IndexedDB.
IndexedDB is a real attempt to give the browser a database (and indeed, it can be used as a quick way to back a PGLite instance) but it’s hamstrung by a notoriously clunky API and performance limitations. It is more like a file system bucket than a database engine. It offers no support for complex queries, joins, or constraints. To do anything interesting, you have to write the database logic yourself in JavaScript, pulling data into memory to filter it, which destroys performance. All of which is to say, it’s messy for real-world use cases.
IndexedDB was a necessary stepping-stone, but it’s not a final destination. The foundation of the modern era was built by WebAssembly and the Origin Private File System (OPFS). These are the technologies that let us stop re-inventing databases in JavaScript and start porting proven engines directly into the client.
The high-speed file system: OPFS
While it sounds like an obscure browser spec, OPFS is important to modern local-first architecture. Where WASM provides the runtime, OPFS provides the file system.
OPFS finally gives the browser direct, high-performance access to the user’s hard drive. Unlike IndexedDB, which forces us to read/write entire files or objects, OPFS allows for random-access writes. This means a database like PGLite can modify a tiny 4KB page of data in the middle of a 1GB file without rewriting the entire thing. This is the missing link that allows server-grade databases to run in the browser with near-native performance.
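Random-access writes are easiest to appreciate next to the alternative. This sketch uses ordinary Python file I/O standing in for OPFS (which is a browser JavaScript API); the file layout and page size are illustrative:

```python
# Page-level random access: update one 4KB page of a large file in place
# by seeking to its offset, rather than rewriting the whole file. This is
# the access pattern OPFS enables for in-browser database engines.
import os
import tempfile

PAGE_SIZE = 4096

def write_page(path: str, page_number: int, data: bytes) -> None:
    """Overwrite a single fixed-size page in place."""
    assert len(data) == PAGE_SIZE
    with open(path, "r+b") as f:
        f.seek(page_number * PAGE_SIZE)  # jump straight to the page
        f.write(data)                    # touch only those 4KB

# A 4MB "database" file of 1,000 zeroed pages.
path = os.path.join(tempfile.mkdtemp(), "db.bin")
with open(path, "wb") as f:
    f.write(b"\x00" * (PAGE_SIZE * 1000))

# Modify page 500 in place; the other 999 pages are untouched.
write_page(path, 500, b"\xff" * PAGE_SIZE)
```

An IndexedDB-style store would force the engine to read and rewrite whole objects for the same logical change, which is exactly why porting a real database engine on top of it performed poorly.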
The NoSQL alternative: RxDB
If PGLite is the champion of the “SQL on the client” movement, RxDB (Reactive Database) is the NoSQL equivalent. It inherits from PouchDB, an in-browser NoSQL database that has been in real-world use for many years.
While PGLite focuses on bringing the structure of the server to the client, RxDB focuses on the behavior of the modern UI. It is designed around reactivity (hence the Rx prefix). In a standard database, you run a query and get a result. In RxDB, you subscribe to a query:
// In RxDB, the database IS the state manager
db.todos.find().$.subscribe(todos => {
render(todos);
});
When the sync engine pushes new data from the server, your UI updates instantly. You don’t need a state management library like Redux or Pinia because the database itself is the source of reactive truth.
Conclusion
The browser is no longer just a document viewer or simple terminal interface. But the emerging local-first architecture is a radical departure from our familiar REST and REST-like solutions. It brings its own complexities.
With the unification of the runtime (WinterTC) and the arrival of industrial-grade local databases (PGlite and RxDB), the browser could potentially be a full-bore application platform. But will it supplant the known way of doing things? Not all at once, and not quickly. Familiarity is a huge ballast in the programming world.
Local-first + syncing could someday knock the crown off JSON and REST. But first, it will have to prove its viability in the real world.
The best new features of C# 14 26 Feb 2026, 1:00 am
Available as a part of .NET 10, which was released last November, C# 14 brings a plethora of new features and enhancements that make it easier to write efficient, highly performant code. Just as we walked through the new features and enhancements in C# 13 and C# 12, in this article we’ll take a close look at some of the best new features in C# 14.
To work with the code examples provided in this article, you should have Visual Studio 2026 or a later version installed in your system. If you don’t already have a copy, you can download Visual Studio 2026 here.
File-based apps
Support for file-based apps is perhaps the most striking new feature in this release of the C# programming language. Until C# 14, running even a minimal .cs file was a multi-step process that incurred significant overhead: you had to create a solution file and a project file just to run your application. Even if all you wanted was a quick calculation or to test a piece of code, you had to create additional files you might never need again. No longer.
With C# 14, now you can run a C# file directly from the command line without needing a project or solution file.
Let us understand this with a code example. Consider a file named Demo.cs that contains the following code.
Console.WriteLine("This is a sample text");
DateTime dateTime = DateTime.UtcNow.Date;
Console.WriteLine($"Today's date is: {dateTime.ToString("d")}");
You can execute the program using the following command at the console window.
dotnet run Demo.cs
When the program is executed, you’ll see the sample text, followed by today’s date, displayed at the console.
Note that you can create file-based apps that reference NuGet packages and SDKs using preprocessor directives, without needing a project or solution file.
Extension members
Extension members are a new feature in C# 14 that let you declare extension properties as well as extension methods. In addition, extension members make it easier to declare extension methods than in previous versions of C#. Before we dive into extension members, let’s first understand extension methods.
In the C# programming language, extension methods are a feature that permits you to augment the capabilities of classes without the necessity of inheritance. You do not need to create subclasses to use extension methods, nor is it necessary to modify or recompile existing class definitions. In addition to improving code readability, extension methods help you add new methods to your existing types (i.e., classes, structs, records, or interfaces). Incidentally, extension methods were first implemented in C# 3.0.
There are numerous extension methods in .NET that expand the querying capabilities of both System.Collections.IEnumerable and System.Collections.Generic.IEnumerable<T> via the LINQ standard query operators. While you can take advantage of extension methods to extend a class or an interface in C#, you cannot use them to override existing methods. Extension methods can extend the functionality of types even if they are sealed, such as the String class in C#.
For example, the Where() extension method is defined in the Enumerable static class in the System.Linq namespace. The following code snippet shows the signature of the Where() extension method:
public static IEnumerable<TSource> Where<TSource>(
    this IEnumerable<TSource> source,
    Func<TSource, bool> predicate)
Note the use of the this keyword. Prior to C# 14, to implement an extension method, you had to create a static method and pass the this reference as a parameter to the method. In C# 14, the code snippet above can be replaced with an extension block, without the need to specify the this parameter. This is shown in the code snippet given below.
extension<TSource>(IEnumerable<TSource> source)
{
    public IEnumerable<TSource> Where(Func<TSource, bool> predicate)
}
The ability to define extension members has other advantages as well. An extension member requires two pieces of information: the receiver to which the member applies, and any parameters it might need if the member is a method. With the new extension member syntax, you define an extension block for a receiver and then write the members as needed. Most importantly, the new syntax lets you define an extension member that takes no parameters at all, i.e., an extension property.
Additionally, by using the new syntax, you can logically group extensions that apply to the same receiver. You can then define a new extension block if the receiver changes. Moreover, the static class in which you write your extension blocks or extension methods (if you’re using an earlier version of the C# language) can contain both the extension methods that require the this parameter and the extension members grouped inside extension blocks, as shown in the C# 14 code listing given below.
public static class StringExtensions
{
    extension(string value)
    {
        public bool ContainsAnyDigit()
        {
            if (string.IsNullOrEmpty(value))
                return false;
            return value.Any(char.IsDigit);
        }

        public bool ContainsAnySpecialCharacter()
        {
            if (string.IsNullOrEmpty(value))
                return false;
            return value.Any(c => !char.IsLetterOrDigit(c));
        }
    }

    public static bool IsNullOrEmptyOrWhiteSpace(this string str)
    {
        return string.IsNullOrWhiteSpace(str);
    }
}
In the preceding code snippet, the extension method IsNullOrEmptyOrWhiteSpace uses the legacy syntax (i.e., it requires the this parameter), whereas the extension methods ContainsAnyDigit and ContainsAnySpecialCharacter use the new syntax.
You can read more about extension members in C# 14 here.
Improvements to the nameof operator for unbound generics
C# 14 improves the nameof operator by adding support for unbound generic types (e.g., List<>, Dictionary<,>). Now that nameof can take an unbound generic type as an argument, you no longer need to supply a dummy type argument (such as List<int>) merely to obtain the type name “List.”
Let us understand this with a code example. Prior to C# 14, you had to specify a type argument for the code to compile.
string typeNameList = nameof(List<int>);
string typeNameDictionary = nameof(Dictionary<string, int>);
With C# 14, unbound generics work directly. You no longer need to specify the type arguments, as shown in the code snippet given below.
string typeNameList = nameof(List<>);
string typeNameDictionary = nameof(Dictionary<,>);
Hence, with C# 14, the following lines of code work perfectly.
Console.WriteLine(nameof(List<>));
Console.WriteLine(nameof(Dictionary<,>));
User-defined compound assignment operators
C# 14 adds support for user-defined compound assignment operators. This feature enables you to write code like x += y instead of x = x + y, as in previous versions of the language. In C# 14, you can overload the +=, -=, *=, /=, %=, &=, |=, ^=, <<=, and >>= operators.
Consider the following code snippet that creates a ShoppingCart class in which the += operator is overloaded.
public class ShoppingCart
{
    public int TotalQuantity { get; private set; } = 0;
    public decimal TotalAmount { get; private set; } = 0m;

    public void operator +=(int quantity)
    {
        TotalQuantity += quantity;
    }

    public void operator +=(decimal amount)
    {
        TotalAmount += amount;
    }
}
The code snippet below shows how you can use the ShoppingCart class.
var cart = new ShoppingCart();
cart += 3;       // Invokes operator +=(int); TotalQuantity becomes 3
cart += 49.99m;  // Invokes operator +=(decimal); TotalAmount becomes 49.99
Thanks to user-defined compound assignment operators, we get cleaner, simpler, and more readable code.
Set TargetFramework to .NET 10
Naturally, you must have .NET 10 installed on your computer to work with C# 14. If you want to change your existing projects to use C# 14, you will need to set the TargetFramework to .NET 10 as shown in the code snippet given below.
<Project Sdk="Microsoft.NET.Sdk">
  <PropertyGroup>
    <OutputType>Exe</OutputType>
    <LangVersion>preview</LangVersion>
    <TargetFramework>net10.0</TargetFramework>
    <ImplicitUsings>enable</ImplicitUsings>
    <Nullable>enable</Nullable>
  </PropertyGroup>
</Project>
You can learn more about the new features in C# 14 here and here.
The C# programming language has improved significantly since its initial release as part of Visual Studio .NET 2002. That was a time when you had to write a lot of verbose code to create C# programs. The new features introduced in C# 14 promise to boost your productivity and help you write cleaner, more maintainable, and more performant code. Whether you’re building an enterprise application, a mobile application, or a web application, this new version of C# provides you with all you need to create world-class contemporary applications.
Three web security blind spots in mobile DevSecOps pipelines 26 Feb 2026, 1:00 am
We know that mobile development in 2025 was different. It shifted from a “front-end” concern to a massive, distributed headache in which the most vulnerable component could be any unmanaged, hostile endpoint. In fact, 43% of organizational breaches originate at the mobile edge.
The problem lies with the outdated web-centric security models that app developers rely on. Mobile platforms operate under fundamentally different trust assumptions, and DevSecOps pipelines need to account for them explicitly.
Here are three technical blind spots that current pipelines often fail to address, and that modern DevSecOps engineers should watch out for.
Blind spot #1: Vulnerability to man-at-the-end attacks
In web-first development, the server is the ultimate “fortress.” Because we control the hardware and software environment, security is focused on sanitizing inputs and hardening the perimeter. Traditional web-centric SAST (static application security testing) tools are designed for this model. They scan for logical flaws in the server binary, assuming the binary itself remains protected within the fortress. On the web, the “don’t trust your client” strategy is easily maintained because the client-side code typically has limited features and can be ephemeral.
In comparison, a mobile app is a “messenger in enemy territory.” The device and the end-user cannot be trusted, as the app binary is physically in the attacker’s hands. Unlike web servers, mobile clients are often responsible for more complex local functions, creating a much larger surface. An attacker can tamper with the binary through repackaging or use tools like Frida to perform dynamic instrumentation to bypass security controls in real time. Because web-centric SAST tools assume the binary is safe in a fortress, they often overlook these critical mobile-specific vulnerabilities and tampering scenarios.
Frida injects a JavaScript engine into the target process’s memory space, allowing an attacker to intercept function calls in real time. Specifically, it leverages inline hooking and PLT/GOT (procedure linkage table/global offset table) interception. It allows the user to redirect the execution of the application code to attacker-controlled code.
While static measures like control flow flattening (modifying the graph of a function to hide its logic) and symbol stripping (removing function names) increase the cost of initial analysis, they cannot stop a dynamic tool like Frida once the attacker identifies the correct memory offsets.
To counter these threats, developers need to do more than obfuscation. They need to add RASP (runtime application self-protection), which monitors the application’s state while it is running. RASP includes:
- Hooking framework detection: Most hooking frameworks leave “artifacts” behind, so a classical detection technique consists of looking for them. For example, Frida often communicates via specific default ports (e.g., 27042) or named pipes. Check /proc/self/maps to see whether unauthorized .so or .dylib files (like frida-agent.so) have been injected into the process space. However, such detections are useful only as a first layer of defense. Attackers can bypass them quite easily by, for instance, replacing “frida” strings with “grida” or changing the port used.
- Anti-tamper and hook detection: In addition to the framework detection, the app should actively scan its own memory. For example, it should periodically check the first few bytes of critical functions for “jump” or “breakpoint” instructions (0xE9 or 0xCC on x86) that indicate a trampoline has been inserted. It should also perform integrity checks on the .text section of the binary in memory to ensure it matches the signed disk version.
- Hardware-backed attestation: This provides a zero-trust verification of the client environment using the OS as a source of truth. Services such as the Android Play Integrity API generate a signed cryptographic token from the OS manufacturer. This token verifies that the binary is unmodified, the device isn’t rooted, and a debugger hasn’t compromised the environment before the back end grants access to sensitive resources.
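The first-layer artifact checks described above can be sketched conceptually. This is Python for readability only; real checks run in native code inside the app, and the library markers and port number are the well-known Frida defaults, which (as noted) an attacker can trivially change:

```python
# First-layer hooking-framework detection sketch: look for injected
# libraries in the process memory map and for Frida's default server port.
import socket

SUSPICIOUS_MARKERS = ("frida", "gadget")  # illustrative artifact names

def injected_libraries(maps_text: str) -> list:
    """Scan /proc/self/maps content for known hooking-framework artifacts.

    On Android/Linux, the app would read its own /proc/self/maps; here we
    take the text as a parameter so the logic is easy to exercise.
    """
    return [
        line for line in maps_text.splitlines()
        if any(marker in line for marker in SUSPICIOUS_MARKERS)
    ]

def frida_server_listening(port: int = 27042) -> bool:
    """Frida's server listens on 27042 by default (trivially changed)."""
    with socket.socket() as s:
        s.settimeout(0.2)
        return s.connect_ex(("127.0.0.1", port)) == 0

# A memory map with an injected agent is flagged immediately.
maps = (
    "7f00 r-xp /usr/lib/libc.so\n"
    "7f01 r-xp /data/local/tmp/frida-agent.so"
)
flagged = injected_libraries(maps)
```

Because these checks match fixed strings and ports, they belong at the bottom of a layered defense, underneath the memory-integrity checks and hardware-backed attestation described above.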
Blind spot #2: Misunderstanding hardware-backed cryptography
Misuse of local device storage creates a common architectural blind spot. Standard encryption libraries often store the master key in the app’s private directory. The key may technically be encrypted, but the approach is equivalent to leaving your house key under the doormat.
EncryptedSharedPreferences and the iOS Keychain are not magic bullets. If these are not explicitly configured to be hardware-backed, the keys remain in the software layer. On a rooted device, an attacker could perform a memory dump or use an Android device backup exploit to extract the keys and decrypt the entire local database. The OS’s “private” sandbox is only as secure as the kernel, and on many user devices, the kernel is an open book.
To address this blind spot, developers must enforce cryptographic binding to the hardware:
- TEE (trusted execution environment) and secure enclave integration: Force keys to be generated and stored within the TEE or secure enclave. This ensures that the private key never enters the application’s memory space. The app sends data to the hardware, the hardware signs or decrypts it and returns the result.
- User-presence requirements: For high-security apps (such as those developed for fintech or health care), the cryptographic key is to be unlocked only by a successful biometric prompt. So even if a device is stolen while “unlocked,” the app’s sensitive data remains cryptographically inaccessible without a secondary “proof of presence.”
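The core contract of TEE-backed keys is that the app hands data to the hardware and gets a signature back, without the private key ever entering app memory. The sketch below simulates that boundary with plain JCA: a holder class exposes only a `sign()` operation, never the key itself. On a real Android device you would pass the `"AndroidKeyStore"` provider to `KeyPairGenerator.getInstance(...)` (with a `KeyGenParameterSpec` requiring user authentication) so the key material actually lives in the TEE; the class and method names here are illustrative.

```java
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.Signature;

// Simulates the TEE contract: callers can request a signature but can
// never read the private key. On Android, use the "AndroidKeyStore"
// provider so the key genuinely never leaves secure hardware.
public class HardwareKeySketch {

    public static final class KeyHolder {
        private final PrivateKey privateKey;  // confined; never exposed
        public final PublicKey publicKey;

        public KeyHolder() throws Exception {
            KeyPairGenerator kpg = KeyPairGenerator.getInstance("EC");
            kpg.initialize(256);
            KeyPair kp = kpg.generateKeyPair();
            this.privateKey = kp.getPrivate();
            this.publicKey = kp.getPublic();
        }

        // Data goes in, a signature comes out; the key stays inside.
        public byte[] sign(byte[] data) throws Exception {
            Signature s = Signature.getInstance("SHA256withECDSA");
            s.initSign(privateKey);
            s.update(data);
            return s.sign();
        }
    }

    // The back end (or a test) verifies with the public key only.
    public static boolean verify(PublicKey pub, byte[] data, byte[] sig) throws Exception {
        Signature s = Signature.getInstance("SHA256withECDSA");
        s.initVerify(pub);
        s.update(data);
        return s.verify(sig);
    }
}
```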
Blind spot #3: Managing the logic entropy of AI assistants
The rise of AI-assisted vibe coding is introducing a new class of logic entropy. Gartner’s projection that 90% of engineers will use AI assistants by 2028 creates a systemic risk: the proliferation of “insecure by default” boilerplate.
AI models are trained on vast amounts of legacy code. When you ask an AI to implement a network call, it often ignores certificate pinning. Sometimes, it uses deprecated TLS (Transport Layer Security) versions because those patterns are statistically more common in its training set. For example, Stanford researchers found that AI-assisted developers are 80% more likely to produce code with vulnerabilities like plaintext credentials or insecure random number generators.
Furthermore, AI can “hallucinate” security configurations, suggesting nonexistent parameters that appear valid and can cause the OS to default to a “fail-open” state. A penetration test of AI-generated mobile code often reveals “shadow logic” that implements complex encryption but hardcodes the IV (initialization vector), making the cryptography vulnerable to modern GPU-based brute-force attacks.
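The hardcoded-IV failure is easy to demonstrate and easy to fix. The sketch below, with illustrative names, shows the correct pattern for AES-GCM: generate a fresh random IV per message and prepend it to the ciphertext. With a hardcoded IV, encrypting the same plaintext twice yields identical ciphertexts, and GCM key-stream reuse can let an attacker recover data.

```java
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

// Correct AES-GCM usage: a fresh 12-byte IV per message, prepended to
// the ciphertext so the receiver can decrypt. Names are illustrative.
public class IvSketch {
    private static final SecureRandom RNG = new SecureRandom();

    public static SecretKey newKey() throws Exception {
        KeyGenerator kg = KeyGenerator.getInstance("AES");
        kg.init(256);
        return kg.generateKey();
    }

    public static byte[] encrypt(SecretKey key, byte[] plaintext) throws Exception {
        byte[] iv = new byte[12];
        RNG.nextBytes(iv);  // never hardcode or reuse this value
        Cipher c = Cipher.getInstance("AES/GCM/NoPadding");
        c.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(128, iv));
        byte[] ct = c.doFinal(plaintext);
        // Output layout: [12-byte IV][ciphertext + GCM tag]
        byte[] out = new byte[iv.length + ct.length];
        System.arraycopy(iv, 0, out, 0, iv.length);
        System.arraycopy(ct, 0, out, iv.length, ct.length);
        return out;
    }
}
```

A quick check of the property: encrypting the same plaintext twice with the same key must produce different outputs, which is exactly what a hardcoded IV breaks.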
DevOps teams need to treat AI as an untrusted contributor:
- Custom linting or analysis for crypto primitives: Implement custom rules (e.g., using MAST tools or custom linting) that specifically target usage of AllowAllHostnameVerifier or InsecureTrustManager, which are common AI “shortcuts” to make code work.
- SBOM (software bill of materials) enforcement: Developers must run an SBOM check to validate every dependency against a vulnerability database before it enters the build stage.
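The linting idea above can be sketched as a plain string scan. A real pipeline would use a proper lint or SAST rule (for example, a custom Android Lint or Semgrep rule) that works on the AST rather than raw text; the class name and banned-pattern list below are assumptions for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

// Toy source scan for insecure TLS shortcuts that AI assistants commonly
// emit. Real rules should be AST-based; this only illustrates the intent.
public class InsecureTlsScan {
    private static final String[] BANNED = {
        "AllowAllHostnameVerifier",
        "InsecureTrustManager",
        "TrustAllCerts"
    };

    // Return one finding per banned pattern present in the source text.
    public static List<String> scan(String source) {
        List<String> findings = new ArrayList<>();
        for (String pattern : BANNED) {
            if (source.contains(pattern)) {
                findings.add("insecure TLS usage: " + pattern);
            }
        }
        return findings;
    }
}
```

Wired into CI as a merge-blocking check, even a rule this crude stops the most common “make the certificate error go away” shortcut before it ships.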
Soon-to-be blind spot: An iOS sideloading surge
In 2026, the abuse of enterprise provisioning profiles will become an additional blind spot. To comply with regulations such as the Digital Markets Act, platforms have opened “sideloading” channels. This is nothing new for Android, but it is now relevant for iOS as well, since Apple must support alternative app stores. While enterprise provisioning helps with internal distribution, it has become a primary vector for repackaging attacks.
Sideloading itself is not the problem. The risk emerges when applications cannot verify their own integrity at runtime. Attackers can take a legitimate app, inject a malicious library (using the memory hooking techniques mentioned above), and re-sign it with a leaked or stolen enterprise certificate. Since the app is signed with a valid Apple or Google-issued developer certificate, it can bypass many OS-level warnings, leading users to install “cracked” versions that are actually surveillanceware.
App developers must monitor for certificate mismatch. Your app should self-verify the fingerprint of its signing certificate at runtime by comparing the active signing key against an embedded hash of your official production key. If the fingerprint doesn’t match, the app should assume it has been repackaged and immediately invalidate all local user sessions and clear the hardware-backed keystore.
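The self-verification step can be sketched as follows: hash the bytes of the active signing certificate and compare the result, in constant time, against a hash of the official release certificate pinned into the app. On Android the live certificate bytes would come from the PackageManager signing info; here they are passed in as an argument, and the class name is illustrative.

```java
import java.security.MessageDigest;

// Sketch of runtime repackaging detection: compare the active signing
// certificate's SHA-256 fingerprint against a pinned release fingerprint.
public class SignatureCheck {

    public static byte[] sha256(byte[] certBytes) throws Exception {
        return MessageDigest.getInstance("SHA-256").digest(certBytes);
    }

    // MessageDigest.isEqual is constant-time, avoiding a timing leak on
    // how many leading bytes of the fingerprint match.
    public static boolean isOfficialBuild(byte[] activeCert, byte[] pinnedSha256) throws Exception {
        return MessageDigest.isEqual(sha256(activeCert), pinnedSha256);
    }
}
```

On a mismatch, the app would invalidate local sessions and clear the hardware-backed keystore, as described above.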
Building for a hostile runtime
It’s common for developers to complain about security taxing performance. RASP checks increase the main thread’s load and can cause frame drops during UI transitions. Hardware-backed encryption adds latency to disk I/O as data must move across the bus to the processor.
Despite these hurdles, with 75% of organizations increasing mobile security spend, the industry is acknowledging that this “performance tax” is significantly cheaper than the average cost of a breach.
In 2026, a robust mobile pipeline doesn’t just “check for bugs” but assumes the app is being run in a laboratory by a malicious actor. Our job is to make the cost of data extraction higher than the value of the data itself.
—
New Tech Forum provides a venue for technology leaders—including vendors and other outside contributors—to explore and discuss emerging enterprise technology in unprecedented depth and breadth. The selection is subjective, based on our pick of the technologies we believe to be important and of greatest interest to InfoWorld readers. InfoWorld does not accept marketing collateral for publication and reserves the right to edit all contributed content. Send all inquiries to doug_dineley@foundryco.com.
Abandoned project linking Java, JavaScript makes a comeback 25 Feb 2026, 4:59 pm
Once envisioned as a bridge between Java and JavaScript, the Detroit project never got off the ground. Now, there are efforts at reviving it, adding a Python engine to the mix.
Intended to enable using JavaScript as an extension language for Java applications, the Detroit project fizzled out after losing its sponsoring group around 2018. But according to a new proposal dated February 25, there still is interest in bringing Java and JavaScript together. The proposal was gathering steam on an OpenJDK mailing list this week.
List participant Sundararajan Athijegannathan, who has offered to lead the project, wrote that “there is also interest in accessing AI functionality written in Python from Java applications.” In addition to making JavaScript available as an extension language for Java applications, the project would allow JavaScript applications to access Java libraries, according to Athijegannathan.
The Detroit project prototype, which involved developing a native implementation of the javax.script package based on the Chrome V8 JavaScript engine, has been revived, Athijegannathan said. Participants also have prototyped a Python script engine based on CPython. Using widely adopted JavaScript and Python implementations, rather than re-implementing the languages from scratch, ensures low long-term maintenance costs and compatibility with existing JavaScript and Python code, Athijegannathan wrote.
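For context, the javax.script package the prototype implements has been part of the JDK since Java 6. The sketch below shows the API surface involved; note that modern JDKs no longer bundle a JavaScript engine (Nashorn was removed in JDK 15), so the engine lookup may return null unless an implementation such as the proposed V8-backed one is on the classpath. The class and method names here are illustrative.

```java
import java.util.ArrayList;
import java.util.List;
import javax.script.ScriptEngine;
import javax.script.ScriptEngineFactory;
import javax.script.ScriptEngineManager;

// Probe of the javax.script (JSR 223) API that the Detroit prototypes
// implement. Handles the common case where no JavaScript engine ships
// with the JDK.
public class ScriptProbe {

    // List the short names of every script engine visible to this JVM.
    public static List<String> availableEngines() {
        List<String> names = new ArrayList<>();
        for (ScriptEngineFactory f : new ScriptEngineManager().getEngineFactories()) {
            names.addAll(f.getNames());
        }
        return names;
    }

    // Evaluate a JavaScript snippet if an engine is present;
    // return null rather than failing when none is installed.
    public static Object tryEval(String script) throws Exception {
        ScriptEngine js = new ScriptEngineManager().getEngineByName("JavaScript");
        return (js == null) ? null : js.eval(script);
    }
}
```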
“We would like to move these prototypes into a proper OpenJDK project in order to accelerate development. We expect to leverage and push the boundaries of the FFM (Foreign Function & Memory) API, so this work will likely influence Project Panama,” he wrote. Panama looks to improve connections between the JVM and non-Java APIs. Over time, the project may consider implementing script engines for additional languages. Votes on the project, from current OpenJDK members only, are due by March 11.