Supercharge Your File Transfers: Understanding ODX (Offloaded Data Transfer) in Windows

Slow file copying and moving operations can be a major bottleneck, especially when dealing with large files on servers. Windows Offloaded Data Transfer (ODX) technology is designed to solve this problem by dramatically speeding up these processes. If you’re looking to optimize your server’s performance, understanding ODX is crucial.

This article dives deep into Windows ODX, explaining how this powerful feature works from a storage perspective. We’ll explore the benefits of ODX, how to identify if your storage supports it, and how it can revolutionize your data management.

What is ODX and How Does it Work?

Imagine copying a massive file – traditionally, your server would read the data from the source storage, transfer it over the network or within the system, and then write it to the destination storage. This process consumes server resources and network bandwidth.

ODX, or Offloaded Data Transfer, changes the game. Instead of your server handling all the data movement, ODX offloads this task to your storage array. Think of it as giving your storage array the instruction to copy the data directly, freeing up your server to handle other tasks.

This magic happens through “tokens.” Let’s break down the ODX copy operation step-by-step, as illustrated in the diagram below:

Alt text: Diagram illustrating the four-step Offloaded Data Transfer (ODX) process: 1. Application sends offload read request. 2. Source copy manager returns a token (ROD). 3. Application sends offload write request with token. 4. Storage array moves data and returns result.

  1. The Copy Request: When you initiate a copy or move operation, the application sends an “offload read request” to the source storage device through the copy manager.
  2. Token Generation: The source storage device’s copy manager doesn’t send back the actual data. Instead, it generates a “token,” also known as a Representation of Data (ROD). This token is essentially a pointer to the data that needs to be copied.
  3. Write Request with Token: The application then sends an “offload write request,” along with the token, to the destination storage device’s copy manager.
  4. Storage Array Data Movement: Here’s where the offloading occurs. The storage array’s copy manager uses the token to directly move the data from the source storage location to the destination storage location, without the server being heavily involved in the data stream. Finally, the storage array sends the offload write result back to the application.

This token-based system is the core of ODX and allows for significantly faster copy and move operations, especially for large files and in virtualized environments.
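
To make the flow concrete, here's a minimal sketch of those same four steps driven from user mode through the Windows FSCTL path (covered in more detail later in this article). The file paths are placeholders, error handling is trimmed, and it assumes Windows 8 / Windows Server 2012 or later with ODX-capable source and destination volumes:

```c
/* Minimal sketch of the four-step token handshake via the Windows FSCTL
 * path (winioctl.h). Paths are placeholders; error handling is trimmed. */
#include <windows.h>
#include <winioctl.h>
#include <string.h>
#include <stdio.h>

int wmain(void)
{
    HANDLE src = CreateFileW(L"\\\\?\\D:\\big.vhdx", GENERIC_READ,
                             FILE_SHARE_READ, NULL, OPEN_EXISTING, 0, NULL);
    HANDLE dst = CreateFileW(L"\\\\?\\E:\\big-copy.vhdx", GENERIC_WRITE,
                             0, NULL, CREATE_ALWAYS, 0, NULL);

    /* Steps 1-2: ask the source for a token; no file data comes back. */
    FSCTL_OFFLOAD_READ_INPUT  rin  = { sizeof(rin) };  /* TokenTimeToLive 0  */
    FSCTL_OFFLOAD_READ_OUTPUT rout = { sizeof(rout) }; /* -> array's default */
    rin.FileOffset = 0;
    rin.CopyLength = 64ULL * 1024 * 1024;              /* one 64 MB chunk    */
    DWORD bytes;
    DeviceIoControl(src, FSCTL_OFFLOAD_READ, &rin, sizeof(rin),
                    &rout, sizeof(rout), &bytes, NULL);

    /* Pre-size the destination so the offload write lands in a valid range. */
    LARGE_INTEGER end; end.QuadPart = (LONGLONG)rout.TransferLength;
    SetFilePointerEx(dst, end, NULL, FILE_BEGIN);
    SetEndOfFile(dst);

    /* Steps 3-4: present the token to the destination; the array moves the
     * data itself and reports how many bytes actually landed. */
    FSCTL_OFFLOAD_WRITE_INPUT  win  = { sizeof(win) };
    FSCTL_OFFLOAD_WRITE_OUTPUT wout = { sizeof(wout) };
    win.FileOffset = 0;
    win.CopyLength = rout.TransferLength;   /* bytes the token represents */
    memcpy(win.Token, rout.Token, sizeof(win.Token));
    DeviceIoControl(dst, FSCTL_OFFLOAD_WRITE, &win, sizeof(win),
                    &wout, sizeof(wout), &bytes, NULL);

    wprintf(L"array moved %llu bytes\n", (unsigned long long)wout.LengthWritten);
    CloseHandle(src);
    CloseHandle(dst);
    return 0;
}
```

If either volume turns out not to be ODX-capable, the same copy simply proceeds over the legacy read/write path, which is exactly the fallback behavior described in the next section.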

Identifying ODX Capability: Is Your Storage Ready?

To leverage the power of ODX, both your source and destination storage devices must be “ODX-capable.” This means they need to support the T10 standard specifications for ODX, including offload read and write operations with tokens.

Windows automatically detects ODX capability during the system boot process or when a new storage device is connected. It does this by:

  1. Querying Copy Offload Capability: Windows interrogates the storage target device to determine if it supports ODX.
  2. Gathering Parameters and Limitations: If ODX is supported, Windows gathers information about the specific parameters and limitations for copy offload operations that the storage device imposes.

By default, Windows intelligently tries to use the ODX path whenever possible. If both the source and destination storage are ODX-capable, Windows will prioritize ODX for copy operations. If the initial ODX request fails for any reason, Windows will recognize this specific source-destination combination as “not ODX capable” and revert to the traditional, legacy copy method for future operations between these locations.
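
Beyond this automatic negotiation, there is a documented system-wide switch you can inspect. The sketch below reads the FilterSupportedFeaturesMode registry value (0 means ODX support is enabled, the default; 1 means an administrator has disabled it); per-LUN capability is still negotiated with the array as described above:

```c
/* Sketch: reading the system-wide ODX switch from the registry.
 * 0 = ODX enabled (default), 1 = disabled by an administrator. */
#include <windows.h>
#include <stdio.h>

int wmain(void)
{
    DWORD mode = 0, size = sizeof(mode);
    LSTATUS rc = RegGetValueW(HKEY_LOCAL_MACHINE,
                              L"SYSTEM\\CurrentControlSet\\Control\\FileSystem",
                              L"FilterSupportedFeaturesMode",
                              RRF_RT_REG_DWORD, NULL, &mode, &size);
    if (rc == ERROR_SUCCESS)
        wprintf(L"ODX is %ls system-wide\n",
                mode == 0 ? L"enabled" : L"disabled");
    else
        wprintf(L"value not set; ODX defaults to enabled\n");
    return 0;
}
```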

ODX Read and Write Operations: Deep Dive

Let’s delve deeper into the technical aspects of ODX read and write operations, understanding the commands and processes involved.

Synchronous Commands and APIs

To ensure reliable and robust operation, especially with large data transfers, ODX utilizes synchronous offload read and write SCSI commands. These commands simplify complex scenarios like MPIO (Multi-Path I/O) and cluster failovers. Windows expects these synchronous commands to complete within 4 seconds, ensuring responsiveness.

For developers and system administrators who need to interact directly with storage arrays and initiate copy offload operations, Windows provides several APIs:

  • FSCTL (File System Control): Allows file system level control over ODX operations.
  • DSM IOCTL (Data Set Management I/O Control): Provides device-specific control for storage management.
  • SCSI_PASS_THROUGH: Enables direct communication with SCSI devices, allowing for fine-grained control over ODX commands.

It’s crucial to note that Windows restricts applications from directly writing to a file system-mounted volume without exclusive access. This is to prevent data corruption or system instability that could arise from conflicts between application writes and file system operations.
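
For illustration, here's a minimal sketch of how an application typically acquires that exclusive access before touching a mounted volume directly; the drive letter is a placeholder, and production code would also handle dismounting and sharing violations:

```c
/* Sketch: taking exclusive access to a mounted volume. FSCTL_LOCK_VOLUME
 * succeeds only when no other handles are open on the volume; the lock is
 * released by FSCTL_UNLOCK_VOLUME or by closing the handle. */
#include <windows.h>
#include <winioctl.h>
#include <stdio.h>

int wmain(void)
{
    HANDLE vol = CreateFileW(L"\\\\.\\X:", GENERIC_READ | GENERIC_WRITE,
                             FILE_SHARE_READ | FILE_SHARE_WRITE, NULL,
                             OPEN_EXISTING, 0, NULL);
    if (vol == INVALID_HANDLE_VALUE)
        return 1;

    DWORD bytes;
    if (DeviceIoControl(vol, FSCTL_LOCK_VOLUME, NULL, 0, NULL, 0, &bytes, NULL))
        wprintf(L"volume locked; exclusive access granted\n");
    else
        wprintf(L"lock failed (%lu): volume is in use\n", GetLastError());

    CloseHandle(vol);   /* closing the handle releases the lock */
    return 0;
}
```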

Offload Read Operations: Getting the Token

The offload read operation is the first step in the ODX process. When an application requests an offload read, it can specify a “token lifetime.” This lifetime determines how long the generated token remains valid. If the application sets the token lifetime to zero, the storage array’s default inactivity timer is used.

The storage array’s copy manager is responsible for maintaining and validating the token based on its lifetime and associated credentials. Windows also imposes a limit of 64 file fragments per offload read request. If this limit is exceeded, the ODX request will fail, and Windows will fall back to the traditional copy method.
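
Fragmentation is one fallback trigger you can estimate from user mode. The rough sketch below counts a file's extents with FSCTL_GET_RETRIEVAL_POINTERS; keep in mind the 64-fragment limit applies per offload read request rather than per file, the path is a placeholder, and a real tool would loop on ERROR_MORE_DATA for heavily fragmented files:

```c
/* Rough heuristic: count a file's extents to anticipate the 64-fragment
 * ceiling on offload read requests. Treat this purely as an estimate. */
#include <windows.h>
#include <winioctl.h>
#include <stdio.h>

int wmain(void)
{
    HANDLE f = CreateFileW(L"D:\\big.vhdx", GENERIC_READ,
                           FILE_SHARE_READ | FILE_SHARE_WRITE, NULL,
                           OPEN_EXISTING, 0, NULL);
    if (f == INVALID_HANDLE_VALUE)
        return 1;

    STARTING_VCN_INPUT_BUFFER in;
    in.StartingVcn.QuadPart = 0;             /* map from the first cluster */

    BYTE buf[64 * 1024];                     /* room for thousands of extents */
    RETRIEVAL_POINTERS_BUFFER *out = (RETRIEVAL_POINTERS_BUFFER *)buf;
    DWORD bytes;

    if (DeviceIoControl(f, FSCTL_GET_RETRIEVAL_POINTERS, &in, sizeof(in),
                        out, sizeof(buf), &bytes, NULL))
        wprintf(L"%lu extent(s): %ls\n", out->ExtentCount,
                out->ExtentCount <= 64
                    ? L"within the ODX fragment limit"
                    : L"likely to fall back to the legacy copy");
    CloseHandle(f);
    return 0;
}
```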

Upon successful completion of the offload read request, the copy manager creates a Representation of Data (ROD) token. This token represents a point-in-time snapshot of the user data and protection information. The ROD token can represent user data in either “open exclusively” or “open with share” format, influencing token invalidation policies.

  • “Open Exclusively”: If the ROD is “open exclusively,” the token may be invalidated if the data is modified or moved.
  • “Open with Share”: If the ROD is “open with share,” the token remains valid even if the data is modified.

A ROD token is a 512-byte structure with the following format:

Size in Bytes   Token Contents
4               ROD Token Type
508             ROD Token ID

The ROD token is opaque and unique, generated and used solely by the storage array, enhancing security. If a token is modified, invalid, or expired, the copy manager can invalidate it during the subsequent offload write operation. The token also includes an inactivity timeout value, indicating how long the storage array will keep the token valid for a subsequent “Write Using Token” operation.
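
Expressed in C, the table above maps onto a structure like the following; the Windows SDK exposes a comparable definition as STORAGE_OFFLOAD_TOKEN in winioctl.h, and the host treats the token body as opaque:

```c
/* The on-the-wire layout of the table above as a C struct. The name
 * ROD_TOKEN is illustrative; only the array interprets the contents. */
#include <windows.h>

#pragma pack(push, 1)
typedef struct ROD_TOKEN {
    BYTE TokenType[4];  /* identifies the token kind, e.g. a zero token */
    BYTE TokenId[508];  /* opaque, array-generated identifier           */
} ROD_TOKEN;
#pragma pack(pop)

C_ASSERT(sizeof(ROD_TOKEN) == 512);  /* exactly 512 bytes on the wire */
```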

Offload Write Operations: Using the Token to Copy

Once the application has received the ROD token from the offload read operation, it initiates the offload write operation. It sends an “offload write request” along with the ROD token to the destination storage device’s copy manager. Similar to read operations, Windows expects synchronous offload write commands to complete within 4 seconds. Failures due to timeouts or errors will cause Windows to revert to the legacy copy operation.

Offload write operations can be completed in one or more steps using “Receive Offload Write Result” commands. If an offload write is partially completed, the copy manager provides an estimated delay and the number of transferred blocks to indicate progress. The copy manager can perform writes sequentially or using a scatter/gather pattern for optimized data placement.

In case of a write failure, the copy manager reports the progress in contiguous logical blocks up to the point of failure. The client application or copy engine can then resume the offload write from the point of failure. Upon successful completion, the copy manager confirms 100% data transfer and sets the estimated status update delay to zero. If the write result repeatedly returns the same progress count despite retries, Windows will eventually fall back to the legacy copy method.
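
Here's a hedged sketch of that resume behavior as a chunked offload-write loop. It assumes the handle and 512-byte token came from an earlier offload read (as in the sketch near the top of this article) and leaves the legacy-copy fallback to the caller:

```c
/* Each offload write reports progress via LengthWritten; the caller
 * advances both the file offset and the offset within the ROD, then
 * retries the remainder until the whole range has landed. */
#include <windows.h>
#include <winioctl.h>
#include <string.h>

BOOL OffloadWriteAll(HANDLE dst, const BYTE token[512],
                     ULONGLONG dstOffset, ULONGLONG totalLength)
{
    ULONGLONG done = 0;
    while (done < totalLength) {
        FSCTL_OFFLOAD_WRITE_INPUT  in  = { sizeof(in) };
        FSCTL_OFFLOAD_WRITE_OUTPUT out = { sizeof(out) };
        in.FileOffset     = dstOffset + done;    /* where to land in the file  */
        in.TransferOffset = done;                /* where to resume in the ROD */
        in.CopyLength     = totalLength - done;  /* remainder of the range     */
        memcpy(in.Token, token, sizeof(in.Token));

        DWORD bytes;
        if (!DeviceIoControl(dst, FSCTL_OFFLOAD_WRITE, &in, sizeof(in),
                             &out, sizeof(out), &bytes, NULL))
            return FALSE;            /* error: caller reverts to legacy copy */
        if (out.LengthWritten == 0)
            return FALSE;            /* no forward progress: give up, too    */
        done += out.LengthWritten;   /* resume from the point reached        */
    }
    return TRUE;                     /* 100% of the range transferred        */
}
```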

Well-Known ROD Tokens: Zero Tokens

ODX also supports “well-known ROD tokens,” which are predefined tokens with known data patterns. A common example is a “zero token.” Applications can use zero tokens to efficiently fill ranges of logical blocks with zeros. This is especially useful for provisioning storage or securely erasing data.

In an offload write using a well-known token, the client application doesn’t need to perform an offload read to obtain the token. However, the storage array must support and recognize the well-known token. If a well-known token is not supported, the storage array will reject the offload write request with an “Invalid Token” error. Like regular ROD tokens, well-known tokens are 512 bytes and have a defined format:

Size in Bytes   Token Contents
4               ROD Token Type
2               Well Known Pattern
506             ROD Token ID
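
In practice, applications rarely build a zero token by hand. On Windows, the usual route is FSCTL_SET_ZERO_DATA, which NTFS can service with an offload write carrying the well-known zero token when the volume sits on ODX-capable storage. A minimal sketch, with path and range as placeholders:

```c
/* Sketch: zero-filling a range with FSCTL_SET_ZERO_DATA. On ODX-capable
 * storage this can be offloaded via the well-known zero token, so no
 * zero-filled buffers cross the wire. */
#include <windows.h>
#include <winioctl.h>

int wmain(void)
{
    HANDLE f = CreateFileW(L"E:\\scratch.vhdx", GENERIC_WRITE, 0, NULL,
                           OPEN_EXISTING, 0, NULL);
    if (f == INVALID_HANDLE_VALUE)
        return 1;

    FILE_ZERO_DATA_INFORMATION z;
    z.FileOffset.QuadPart      = 0;                           /* range start        */
    z.BeyondFinalZero.QuadPart = 10ULL * 1024 * 1024 * 1024;  /* first byte past it */

    DWORD bytes;
    BOOL ok = DeviceIoControl(f, FSCTL_SET_ZERO_DATA, &z, sizeof(z),
                              NULL, 0, &bytes, NULL);
    CloseHandle(f);
    return ok ? 0 : 1;
}
```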

Performance Tuning Parameters of ODX Implementation

One of the key advantages of ODX is that its performance is largely independent of network link speeds between the client and server or the storage area network (SAN). The data movement happens directly within the storage array, managed by the copy manager and device servers.

However, ODX is not always beneficial for every copy operation. For very small files, or when the storage is already fast relative to the transfer size, the overhead of the token exchange can outweigh the benefit. For instance, even a 1-Gbit iSCSI storage array can copy a 3-GB file in under half a minute at transfer rates that saturate the network interface, so offloading such a copy saves relatively little.

To optimize ODX performance, Windows implements several tuning parameters:

  • Minimum File Size: Windows sets a minimum file size of 256 KB for ODX operations. For files smaller than this, the system automatically falls back to the legacy copy process. This avoids the overhead of ODX for very small files where the benefit would be negligible.
  • Maximum Token Transfer Size and Optimal Transfer Count: Windows uses these parameters to determine the optimal transfer size for offload read and write SCSI commands. The total data transfer size in a single command should not exceed the maximum token transfer size. If the storage array doesn’t specify an optimal transfer count, Windows uses a default of 64 MB.

By adhering to these parameters, applications can maximize the performance benefits of ODX for suitable workloads.
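
To make the interplay of these parameters concrete, here is a small helper that applies the rules above. The 256 KB and 64 MB figures come straight from this section; the function and parameter names are illustrative, not part of any Windows API:

```c
/* Chunking rule: skip ODX below the 256 KB floor, prefer the array's
 * optimal transfer count (64 MB when unreported), and never exceed the
 * maximum token transfer size in a single command. */
#include <windows.h>

#define ODX_MIN_FILE_SIZE   (256ULL * 1024)        /* 256 KB floor       */
#define ODX_DEFAULT_OPTIMAL (64ULL * 1024 * 1024)  /* 64 MB default size */

/* Returns 0 when the copy should use the legacy path instead of ODX. */
ULONGLONG OdxChunkSize(ULONGLONG fileSize,
                       ULONGLONG maxTokenTransferSize, /* reported by array */
                       ULONGLONG optimalTransferCount) /* 0 if not reported */
{
    if (fileSize < ODX_MIN_FILE_SIZE)
        return 0;                       /* too small: legacy copy */
    ULONGLONG chunk = optimalTransferCount ? optimalTransferCount
                                           : ODX_DEFAULT_OPTIMAL;
    if (chunk > maxTokenTransferSize)
        chunk = maxTokenTransferSize;   /* never exceed the token limit */
    return chunk;
}
```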

ODX Error Handling and High Availability Support

Robust error handling and high availability are critical for any storage technology, and ODX is no exception.

ODX Error Handling

ODX incorporates a robust error handling mechanism. If an ODX operation fails during a file copy request, Windows gracefully falls back to the traditional legacy copy operation. Specifically, if an offload write operation fails mid-transfer, Windows and NTFS resume the copy using the legacy method, starting from the point of failure.

Furthermore, after an ODX failure, NTFS marks the source and destination LUN (Logical Unit Number) as “not ODX-capable” for a period of three minutes. During this time, subsequent copy operations between these LUNs will use the legacy method. After three minutes, Windows will automatically retry using ODX for copy operations involving these LUNs. This mechanism allows storage arrays to temporarily disable ODX in specific paths during high-stress situations, ensuring overall system stability.

ODX Failover in MPIO and Cluster Server Configurations

In environments utilizing MPIO or cluster server configurations, maintaining data integrity and availability during failovers is paramount. ODX operations are designed to be resilient in these scenarios. Offload read and write operations must be completed or canceled using the same storage link (I_T nexus).

When an MPIO path failover or a cluster server failover occurs during a synchronous ODX operation, Windows handles it as follows:

  • MPIO Path Failover: If an MPIO path fails, Windows retries the failed ODX command. If the command fails again, Windows takes further action:
    • Cluster Server: If part of a cluster server, Windows initiates a cluster node failover.
    • Non-Cluster Server: If not in a cluster, Windows issues a LUN reset to the storage device and reports an I/O failure status to the application.
  • Cluster Server Failover: In a cluster server environment, the cluster storage service fails over to the next preferred cluster node and restarts the cluster storage service. Cluster-aware applications are expected to retry the ODX read/write command after the service failover.

In the event of repeated failures after MPIO path and cluster node failovers, Windows will issue a LUN reset to the storage device. This reset terminates all pending commands and operations on the LUN, ensuring a clean state after a major failure event. Currently, Windows does not utilize asynchronous offload read or write SCSI commands in its storage stack.

ODX Usage Models: Real-World Applications

ODX offers flexibility and can be applied in various scenarios, enhancing data management across different storage configurations.

ODX Across Physical Disks, Virtual Hard Disks, and SMB Shared Disks

For ODX to function, the application server needs read/write access to both the source and destination LUNs. The application initiates an offload read to the source LUN, obtains a token, and then uses that token to perform an offload write to the destination LUN. The storage array then handles the data transfer within the storage network. The following diagram illustrates the basic supported source and destination targets for ODX:

Alt text: Diagram showing ODX support across different storage types: Physical Disk to Physical Disk, Physical Disk to Virtual Hard Disk, Virtual Hard Disk to Virtual Hard Disk, and SMB Share to SMB Share.

ODX Operation with One Server

In a single-server setup, the server acts as both the source and destination for ODX operations.

The server (or VM) has access to both the source LUN (which can be a VHD or Physical Disk) and the destination LUN (also VHD or Physical Disk). The copy application on this server initiates the offload read and write requests. The storage array then moves data between the source and destination LUNs within the same storage system.

ODX Operation with Two Servers

ODX can also be used across two servers, particularly in configurations with shared storage managed by the same copy manager.

  • One server (or VM) hosts the source LUN, and another server (or VM) hosts the destination LUN. Both servers share their respective LUNs with an application client via SMB. This allows the application client to access both source and destination storage.
  • The source and destination storage arrays are managed by a unified copy manager within a SAN environment.
  • From the application client, an ODX copy operation is initiated. The client sends an offload read request to the source LUN, receives a token, and then sends an offload write request with the token to the destination LUN. The copy manager then orchestrates the data movement between the two storage arrays, potentially located in different physical locations.

Massive Data Migration

Data migration, especially for large datasets, can be time-consuming and resource-intensive. ODX offers a powerful solution for speeding up massive data migrations, such as moving databases, large file repositories, or entire virtual machine libraries. This is particularly useful during storage system upgrades, database engine migrations, or significant changes in application or business processes.

ODX-based data migration is possible when the copy manager of the new storage system can also manage the legacy storage system. In a typical scenario:

  • One server hosts the legacy storage system, and another hosts the new storage system. Both servers share their storage via SMB to a data migration application client.
  • Both the legacy and new storage systems are managed by the same copy manager within a SAN.
  • The data migration application client initiates ODX operations. It requests a token from the source (legacy) storage and uses it to write to the destination (new) storage. The copy manager handles the data transfer between the two different storage systems.
  • Importantly, massive data migration using ODX can also be performed on a single server if both source and destination storage are accessible.

Host-Controlled Data Transfer within a Tiered Storage Device

Tiered storage is a common strategy to optimize storage costs and performance by categorizing data based on access frequency, performance requirements, and protection needs. ODX facilitates host-controlled data migration within tiered storage environments, enabling efficient data lifecycle management.

Consider a two-tiered storage system:

  • A server hosts the tiered storage system. Tier1 storage is for high-performance, frequently accessed data, and Tier2 storage is for less frequently accessed, archival data.
  • A single copy manager manages all tiers of storage.
  • A data migration application on the server can use ODX to move data between tiers. It requests a token from the source Tier1 storage and uses it to write to the destination Tier2 storage. The copy manager handles the data movement between the tiers.
  • Once data migration to Tier2 is complete, the application can delete the data from Tier1, reclaiming valuable high-performance storage space.

Conclusion: Embrace the Speed of ODX

Windows Offloaded Data Transfer (ODX) is a game-changing technology for modern storage environments. By offloading data transfer operations to the storage array, ODX significantly reduces server load, minimizes network congestion, and dramatically accelerates file copy and move operations. Whether you’re managing a single server or a complex, clustered infrastructure, understanding and leveraging ODX can lead to substantial performance improvements and greater efficiency in your data management workflows. Check your storage array’s documentation to see if it supports ODX and unlock the potential for faster, more efficient data operations in your Windows environment.
