Is 94% of your syslog just noise? Now you can filter it out before ingestion.

Jun 03, 2026 - 18:19

Updated: 26 days ago


At Microsoft Build 2026, we are announcing the public preview of multi-stage transformations for Azure Monitor Data Collection Rules (DCRs). Multi-stage transformations let you filter, aggregate, parse, and map your logs at the point of collection, before data is ingested into your workspace. Processing happens in a defined sequence of steps called processors, and you can chain them together to build precise data pipelines that reduce ingestion volume, improve data quality, and lower monitoring costs.

Figure: Processors in orange run on the agent (client-side). The KQL transform in green runs in the ingestion pipeline. Data volume shrinks at each stage.

What are multi-stage transformations?

A Data Collection Rule defines how Azure Monitor collects, transforms, and routes telemetry data. Until now, DCRs supported a single KQL transformation step on the ingestion side. Multi-stage transformations extend this model by introducing a processor pipeline: an ordered sequence of processing steps that run on the agent (client-side), at the ingestion endpoint (ingestion-side), or both.

Each processor performs one operation: filtering records, parsing structured fields from raw text, renaming or dropping columns, aggregating metrics, or running a KQL expression. Processors execute in order, and the output of one becomes the input to the next. This composable design replaces what previously required complex, monolithic KQL queries or external pre-processing scripts.
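The ordered, composable model described above can be illustrated with a small Python sketch. This is not the Azure Monitor Agent implementation or the DCR schema; the helper names (keep_if, drop_columns, run_pipeline) and the sample columns are hypothetical and exist only to show how the output of one step feeds the next.

```python
from typing import Callable, Iterable, Iterator

Record = dict
Processor = Callable[[Iterable[Record]], Iterator[Record]]


def keep_if(predicate: Callable[[Record], bool]) -> Processor:
    """Build a filter step that passes only records matching the predicate."""
    def step(records: Iterable[Record]) -> Iterator[Record]:
        return (r for r in records if predicate(r))
    return step


def drop_columns(*names: str) -> Processor:
    """Build a step that removes the named columns from every record."""
    def step(records: Iterable[Record]) -> Iterator[Record]:
        return ({k: v for k, v in r.items() if k not in names} for r in records)
    return step


def run_pipeline(records: Iterable[Record], steps: list[Processor]) -> list[Record]:
    """Apply each step in order; the output of one is the input to the next."""
    for step in steps:
        records = step(records)
    return list(records)


# Hypothetical sample records, for illustration only.
sample = [
    {"Host": "vm-01", "Level": "error", "Message": "disk full", "Raw": "<payload>"},
    {"Host": "vm-01", "Level": "debug", "Message": "heartbeat", "Raw": "<payload>"},
]

pipeline = [
    keep_if(lambda r: r["Level"] != "debug"),  # step 1: filter
    drop_columns("Raw"),                       # step 2: slim down
]
result = run_pipeline(sample, pipeline)
print(result)
```

Because each step is a separate function, reordering or replacing one step does not require touching the others, which is the maintainability argument the post makes for processors over a single monolithic query.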

Client-side processors run on the Azure Monitor Agent before data leaves the source machine. This means filtered and aggregated data never crosses the network, reducing both egress and ingestion costs. Ingestion-side processors run in the Log Analytics ingestion pipeline and support KQL-based transformations for more complex logic.

Key applications

The most immediate use case is cost reduction. When you can filter records on the agent before they leave the machine, you stop paying for data you never query. Syslog is the classic example: in many environments, informational and debug messages make up the vast majority of volume, and none of it gets looked at unless something breaks. A single filter processor can cut that stream by 90% or more.

Aggregation is equally powerful for high-frequency telemetry. Performance counters sampled every 15 seconds produce millions of records per hour across a large fleet, but most dashboards and alert rules only need 5-minute granularity. Rolling up those samples on the agent, before they cross the network, dramatically reduces ingestion without losing the operational signal your team actually relies on.

Beyond cost, multi-stage transformations improve the quality of the data that does reach your workspace. Parsing structured fields out of raw text (JSON payloads, XML event data, CEF security logs) at collection time means downstream queries are simpler and faster. And because each processor handles one step in a readable sequence, maintaining the pipeline is far easier than debugging a single monolithic KQL expression that tries to do everything at once.

To make this concrete, let’s walk through the two highest-impact patterns we see with preview customers: filtering noisy syslog data and aggregating performance counters.

Filter data before ingestion

The filter processor evaluates each record against conditions you define and drops anything that does not match. Because filtering runs on the agent, dropped records are never serialized, transmitted, or ingested. This makes it the highest-impact processor for cost reduction.

You configure filters using simple field-level conditions: specify a column name, an operator (equals, not equals, greater than, contains, etc.), and a value. Conditions can be combined with AND/OR logic for precise control.

Scenario: Keep only warning-and-above syslog messages

A typical syslog stream generates thousands of informational and debug messages for every actionable warning or error. With a filter processor, you set a severity threshold, and the agent drops everything below it before transmission.

In this example, the filter keeps records where SeverityNumber >= 4 (Warning). The 57,000 debug and informational records per hour are dropped on the machine. Only the 3,250 actionable records are transmitted and ingested, a 94% reduction in syslog volume.
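The threshold check and the reduction figure can be reproduced with a few lines of Python. This is a sketch of the logic only, not agent code: the function names are hypothetical, and the comparison direction simply mirrors the condition quoted above. Note that standard syslog severities (RFC 5424) are numbered from 0 (Emergency) to 7 (Debug), so confirm which direction your SeverityNumber column runs before copying a threshold.

```python
WARNING_THRESHOLD = 4  # threshold value quoted in the post


def keep_record(record: dict, threshold: int = WARNING_THRESHOLD) -> bool:
    """Mirror the post's condition: keep records with SeverityNumber >= threshold."""
    return record["SeverityNumber"] >= threshold


def reduction_percent(kept: int, dropped: int) -> float:
    """Share of the original stream that is dropped, as a percentage."""
    return 100.0 * dropped / (kept + dropped)


# Hourly record counts quoted in the post.
kept, dropped = 3_250, 57_000
print(f"{reduction_percent(kept, dropped):.1f}% of records dropped")
```

With the quoted counts, the dropped share works out to about 94.6%, which the post rounds down to 94%.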

Filters also support compound conditions. For example, you can keep auth-facility errors OR any critical message regardless of facility, all in a single processor step. This kind of targeted filtering is especially useful for security teams that need specific event categories without paying for the full syslog firehose.
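A compound condition of this kind can be modeled as small predicates combined with AND/OR. The sketch below is illustrative Python, not the DCR filter syntax; the operator names, the Condition class, and the column names Facility and SeverityLevel are assumptions made for the example.

```python
import operator
from dataclasses import dataclass
from typing import Any, Callable

# Hypothetical operator names, loosely following the list in the post.
OPERATORS: dict[str, Callable[[Any, Any], bool]] = {
    "equals": operator.eq,
    "not_equals": operator.ne,
    "greater_than": operator.gt,
    "contains": lambda field, value: value in field,
}


@dataclass(frozen=True)
class Condition:
    """One field-level condition: column, operator, value."""
    column: str
    op: str
    value: Any

    def __call__(self, record: dict) -> bool:
        return OPERATORS[self.op](record[self.column], self.value)


def all_of(*predicates):
    """AND: every predicate must match."""
    return lambda record: all(p(record) for p in predicates)


def any_of(*predicates):
    """OR: at least one predicate must match."""
    return lambda record: any(p(record) for p in predicates)


# Keep auth-facility errors OR any critical message regardless of facility.
keep = any_of(
    all_of(
        Condition("Facility", "equals", "auth"),
        Condition("SeverityLevel", "equals", "error"),
    ),
    Condition("SeverityLevel", "equals", "critical"),
)

print(keep({"Facility": "auth", "SeverityLevel": "error"}))
print(keep({"Facility": "cron", "SeverityLevel": "error"}))
```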

Aggregate logs before ingestion

The aggregate processor rolls up high-frequency records into time-windowed summaries on the agent. This is especially valuable for performance counters, heartbeat signals, and any telemetry where per-second granularity is not needed for operational decisions.

You configure the processor with a time window (for example, 5 minutes), the aggregation operators to apply (average, sum, min, max, count), and the dimension columns to group by (such as host name and counter name). The agent collects records within each window, computes the aggregates, and emits one summary record per group.

Scenario: Roll up performance counters into 5-minute summaries

A fleet of 500 VMs, each reporting 10 performance counters every 15 seconds, generates 1.2 million raw records per hour. Most operational dashboards and alert rules use 5-minute granularity, making the per-sample detail redundant.

With the aggregate processor, each agent rolls up its local counter stream into 5-minute windows, grouped by counter name. Each summary record contains the average, maximum, and sample count for that window.
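The roll-up can be sketched as a tumbling-window aggregation. The Python below is an illustration of the technique under simple assumptions, not the agent's implementation: timestamps are plain integer seconds, windows are aligned to multiples of 300 seconds, and the field names (ts, counter, value) are hypothetical.

```python
from collections import defaultdict

WINDOW_SECONDS = 300  # 5-minute windows


def aggregate(samples: list[dict], window: int = WINDOW_SECONDS) -> list[dict]:
    """Group samples by (window start, counter) and emit one summary per group."""
    groups: dict[tuple[int, str], list[float]] = defaultdict(list)
    for s in samples:
        window_start = s["ts"] - s["ts"] % window  # align to the window boundary
        groups[(window_start, s["counter"])].append(s["value"])
    return [
        {
            "window_start": start,
            "counter": counter,
            "avg": sum(values) / len(values),
            "max": max(values),
            "count": len(values),
        }
        for (start, counter), values in sorted(groups.items())
    ]


# Hypothetical samples from one machine.
samples = [
    {"ts": 0, "counter": "cpu", "value": 10.0},
    {"ts": 15, "counter": "cpu", "value": 30.0},
    {"ts": 300, "counter": "cpu", "value": 50.0},
    {"ts": 15, "counter": "mem", "value": 4.0},
]
summaries = aggregate(samples)
for row in summaries:
    print(row)
```

Keeping the sample count alongside the average lets downstream queries recombine windows correctly (a weighted average) if a coarser granularity is needed later.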

Metric	Raw data	After aggregation (5-min windows)
Records per VM per hour	2,400 (10 counters x 4/min x 60 min)	120 (10 counters x 12 windows)
Records across 500 VMs per hour	1,200,000	60,000
Volume reduction		95%
Operational fidelity	Per-sample (15s)	Avg, max, and count per 5 min

Because the aggregation runs on the agent, the reduced data set is what gets transmitted and ingested. Dashboards and alerts that rely on 5-minute granularity work identically, but ingestion costs drop by 95%. Route the output to a custom table with columns that match the aggregate output (average, max, count, and your dimension columns).
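The figures in the table follow directly from the sampling interval and window size. The short Python check below reproduces them; the function name is hypothetical, and you can substitute your own fleet size and counter count to estimate the effect in your environment.

```python
def records_per_hour(counters: int, interval_seconds: int) -> int:
    """Records one VM emits per hour at one record per counter per interval."""
    return counters * (3600 // interval_seconds)


FLEET_SIZE = 500
COUNTERS = 10

raw_per_vm = records_per_hour(COUNTERS, 15)    # 15-second samples
agg_per_vm = records_per_hour(COUNTERS, 300)   # 5-minute summaries

raw_fleet = raw_per_vm * FLEET_SIZE
agg_fleet = agg_per_vm * FLEET_SIZE
reduction = 100 * (raw_fleet - agg_fleet) / raw_fleet

print(f"per VM: {raw_per_vm} -> {agg_per_vm}")
print(f"fleet:  {raw_fleet:,} -> {agg_fleet:,} ({reduction:.0f}% fewer records)")
```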

Chain processors for complete pipelines

Processors are composable. A common pattern chains a header processor (to convert raw data into tabular format), a filter (to drop irrelevant records), a parse step (to extract fields from structured payloads), and a column drop (to remove fields not needed downstream).

Scenario: Parse, filter, and slim down Windows Event logs

Consider a security team that needs logon success and failure events (Event IDs 4624 and 4625) from the Windows Security log. The raw event stream contains hundreds of event types, each carrying a large XML payload. A four-step pipeline handles this:

1. Header processor converts the raw event stream into tabular rows
2. Parse processor extracts EventID and TargetUser from the XML payload into typed columns
3. Filter processor keeps only logon success (4624) and failure (4625) events, dropping everything else
4. Drop processor removes the bulky RawXml and RenderingInfo columns that are no longer needed

The result is a lean, security-focused data set containing only the events and fields the team actually queries. Each step is independent and can be modified without affecting the others.
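The four steps can be mimicked in plain Python to show how the data changes shape at each stage. This is a simplified sketch, not the processors themselves: the XML payload here is a made-up, namespace-free stand-in (real Windows event XML is more deeply nested), and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

LOGON_EVENT_IDS = {4624, 4625}  # logon success and failure


def header(raw_events):
    """Step 1: turn raw payload strings into tabular rows."""
    return [{"RawXml": xml, "RenderingInfo": "<rendered text>"} for xml in raw_events]


def parse(rows):
    """Step 2: extract EventID (int) and TargetUser (str) from the XML payload."""
    for row in rows:
        root = ET.fromstring(row["RawXml"])
        yield {
            **row,
            "EventID": int(root.findtext("EventID")),
            "TargetUser": root.findtext("TargetUser"),
        }


def keep_logons(rows):
    """Step 3: keep only logon success and failure events."""
    return (r for r in rows if r["EventID"] in LOGON_EVENT_IDS)


def drop_bulky(rows, columns=("RawXml", "RenderingInfo")):
    """Step 4: remove columns that are no longer needed downstream."""
    return ({k: v for k, v in r.items() if k not in columns} for r in rows)


# Simplified, hypothetical payloads for illustration.
raw = [
    "<Event><EventID>4624</EventID><TargetUser>alice</TargetUser></Event>",
    "<Event><EventID>4688</EventID><TargetUser>svc</TargetUser></Event>",
    "<Event><EventID>4625</EventID><TargetUser>bob</TargetUser></Event>",
]
lean = list(drop_bulky(keep_logons(parse(header(raw)))))
print(lean)
```

Parsing must come before filtering here because the filter depends on the EventID column that the parse step creates, which is why the order of processors matters.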

Authoring multi-stage DCRs

Multi-stage transformations are available through the Azure portal and through the REST API (version 2025-05-11). The portal provides a visual editor for building processor pipelines, previewing the schema at each stage, and validating the configuration before deployment.

The Transform tab in the DCR data source configuration lets you add processors at each stage and preview the resulting schema.

For infrastructure-as-code workflows, the full DCR JSON can be authored and deployed via ARM templates, Bicep, or direct REST API calls.

To get started:

1. Open Azure Monitor in the Azure portal and navigate to Data Collection Rules
2. Create a new DCR or edit an existing one
3. In the data source configuration, select Edit transformation
4. Author your transformation logic across client and ingestion stages using the set of available processors
5. Preview the schema output at each stage to verify the pipeline produces the expected result
6. Save and associate the DCR with your target resources

Preview notes:

Multi-stage transformations are available in public preview starting June 3, 2026
Client-side processors require Azure Monitor Agent version 1.35 or later
Aggregation output must be routed to custom tables (standard table schemas do not match aggregate output)
Data collection, workspace ingestion, and alert rules may incur costs based on the settings you enable. Preview pricing may differ from general availability pricing. See Azure Monitor pricing for current rates

To learn more, see:

Data Collection Rules overview

Looking ahead

Multi-stage transformations are part of our continued investment in giving teams control over their data before it reaches the workspace. During the preview period, we plan to expand processor coverage, add support for additional data source types, and incorporate user feedback into the authoring and validation experience.

We are also exploring how multi-stage transformations can serve as the foundation for advanced scenarios such as data scrubbing, inline enrichment from external reference data, and AI-assisted pipeline authoring. These capabilities will build on the same processor model, so pipelines you create today will extend naturally as new processors become available.

We welcome your feedback as you try multi-stage transformations. Use the feedback options in the Azure portal, or reach out through your Microsoft account team.

This feature is currently in preview. Previews are provided "as-is," "with all faults," and "as available," and are excluded from the service level agreements and limited warranty. For more information, see Supplemental Terms of Use for Microsoft Azure Previews.

Statements in this post about future plans and capabilities represent our current intentions and are subject to change. They should not be relied upon when making purchasing decisions.

Christopher Holloway

Christopher Holloway is the founder and director of Progressive Robot, a UK-based technology company. A full-stack engineer with more than two decades of experience, he works across PHP development, ecommerce, Linux infrastructure, technical SEO and AI automation, and writes here on technology, AI, hardware and software.
