Azure Data Services

The following is from the Azure Administrator Training lab for AZ-103.

CDN Benefits

A content delivery network (CDN) is a distributed network of servers that can efficiently deliver content to users. CDNs store cached content on edge servers that are close to end users.

CDNs are typically used to deliver static content such as images, style sheets, documents, client-side scripts, and HTML pages. The main benefits of using a CDN are:

  • Lower latency and faster delivery of content to users, regardless of their geographical location in relation to the datacenter where the application is hosted.
  • Helps to reduce load on a server or application, because it does not have to service requests for the content that is hosted in the CDN.

Typical uses for a CDN include:

  • Delivering static resources for client applications, often from a website.
  • Delivering public static and shared content to devices such as cell phones and tablet computers.
  • Serving entire websites that consist of only public static content to clients, without requiring any dedicated compute resources.
  • Streaming video files to the client on demand.
  • Generally improving the experience for users, especially those located far from the datacenter hosting the application.
  • Supporting IoT (Internet of Things) solutions, such as distributing firmware updates.
  • Coping with peaks and surges in demand without requiring the application to scale, avoiding the consequent increased running costs.

✔️ CDN provides a faster, more responsive user experience. Do you think your organization would be interested in this feature?

✔️ Use the following link to review some of the challenges with deploying CDN, including security, deployment, versioning, and testing.

For more information, you can see:

Content Delivery Network Documentation – https://docs.microsoft.com/en-us/azure/cdn/cdn-overview

How CDN Works

You can enable Azure Content Delivery Network to cache content for the user. The Azure CDN is designed to send audio, video, images, and other files faster and more reliably to customers, using servers that are closest to the users. This dramatically increases speed and availability, resulting in significant user experience improvements.

 

 

  1. A user (Alice) requests a file (also called an asset) using a URL with a special domain name, such as endpointname.azureedge.net. DNS routes the request to the best performing Point-of-Presence (POP) location, which is usually the POP that is geographically closest to the user.
  2. If the edge servers in the POP do not have the file in their cache, the edge server requests the file from the origin. The origin can be an Azure Web App, Azure Cloud Service, Azure Storage account, or any publicly accessible web server.
  3. The origin returns the file to the edge server, including optional HTTP headers describing the file's Time-to-Live (TTL).
  4. The edge server caches the file and returns the file to the original requestor (Alice). The file remains cached on the edge server until the TTL expires. Azure CDN automatically applies a default TTL of seven days unless you've set up caching rules in the Azure portal.
  5. Additional users may then request the same file using that same URL and may also be directed to that same POP.
  6. If the TTL for the file hasn't expired, the edge server returns the file from the cache.

✔️ After you enable CDN access to a storage account, all publicly available objects are eligible for CDN edge caching. If you modify an object that's currently cached in the CDN, the updated content will not be available via the CDN until the CDN refreshes its content after the time-to-live period for the cached content expires.

 

CDN Profiles

A CDN profile is a collection of CDN endpoints with the same pricing tier and provider (origin). You may create multiple profiles to organize endpoints. For example, you could have profiles with endpoints for different internet domains, web applications, or storage accounts. You can create up to 8 CDN profiles per subscription.

 

 

You can create a CDN profile from the Azure portal.

The CDN service is global and not bound to a location; however, you must specify a resource group location where the metadata associated with the CDN profile will reside. This location will not have any impact on the runtime availability of your profile.

Several pricing tiers are available. At the time of this writing, there are four tiers: Azure CDN Standard from Microsoft, Azure CDN Standard from Akamai, Azure CDN Standard from Verizon, and Azure CDN Premium from Verizon. Pricing is based on TBs of outbound data transfers.

Notice you can create your first profile endpoint directly from this blade (last checkbox, not shown).
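For reference, a profile can also be created with a minimal PowerShell sketch like the one below. It assumes the Az.Cdn module is installed and you are already signed in; the resource names are placeholders, and parameter names can differ slightly between module versions.

# Resource group that will hold the CDN profile metadata
New-AzResourceGroup -Name "rg-cdn-demo" -Location "EastUS"

# Create a CDN profile on one of the available pricing tiers
New-AzCdnProfile -ProfileName "cdn-demo-profile" `
                 -ResourceGroupName "rg-cdn-demo" `
                 -Location "EastUS" `
                 -Sku Standard_Microsoft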

✔️ Can you think of different scenarios that would require different CDN profiles?

 

CDN Endpoints

When you create a new CDN endpoint directly from the CDN profile blade, you are prompted for the CDN endpoint name, Origin type, and Origin hostname. To access cached content on the CDN, use the CDN URL provided in the portal. In this case,

ASHStorage.azureedge.net/<myPublicContainer>/<BlobName>

 

There are four choices for Origin type: Storage, Cloud Service, Web App, and Custom origin. In this course we are focusing on storage CDNs.

When you select Storage as the Origin type, the new CDN endpoint uses the host name of your storage account as the origin server.
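The same inputs (endpoint name, origin type, and origin hostname) can be supplied from PowerShell. The sketch below assumes the Az.Cdn module and the placeholder profile from earlier; all names are illustrative.

# Create a CDN endpoint that uses a storage account blob endpoint as its origin
New-AzCdnEndpoint -EndpointName "ashstorage-demo" `
                  -ProfileName "cdn-demo-profile" `
                  -ResourceGroupName "rg-cdn-demo" `
                  -Location "EastUS" `
                  -OriginName "storage-origin" `
                  -OriginHostName "ashstorage.blob.core.windows.net"

# Cached content is then reachable at:
#   https://ashstorage-demo.azureedge.net/<myPublicContainer>/<BlobName>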

There are additional CDN features for your delivery, such as compression, query string caching, and geo filtering. You can also add custom domain mapping to your CDN endpoint and enable custom domain HTTPS. These options are configured in the Settings blade for the endpoint.

✔️ Because it takes time for the registration to propagate, the endpoint isn't immediately available for use. For Azure CDN from Akamai profiles, propagation usually completes within one minute. For Azure CDN from Verizon profiles, propagation usually completes within 90 minutes, but in some cases can take longer.

 

CDN Time-to-Live

Any publicly accessible blob content can be cached in Azure CDN until its time-to-live (TTL) elapses. The TTL is determined by cache-directive headers in the HTTP response from the origin server. If the Cache-Control header does not provide the TTL information, or if you prefer, you can configure caching rules to set the Cache Expiration Duration.

  • Global caching rules. You can set the Cache Expiration Duration for each endpoint in your profile, which affects all requests to the endpoint. TTL is configured as days, hours, minutes, and seconds.

 

  • Custom caching rules. You can also create custom caching rules for each endpoint in your profile. Custom caching rules match specific paths and file extensions, are processed in order, and override the global caching rule.
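If you prefer the origin to supply the TTL, a commonly documented approach is to set the Cache-Control property on the blob itself. Below is a minimal sketch assuming the Az.Storage module; the account, container, and blob names are placeholders.

# Context for the storage account that acts as the CDN origin
$ctx = New-AzStorageContext -StorageAccountName "ashstorage" -UseConnectedAccount

# Set a Cache-Control header so the CDN caches this blob for 7 days (604800 seconds)
$blob = Get-AzStorageBlob -Container "mypubliccontainer" -Blob "logo.png" -Context $ctx
$blob.ICloudBlob.Properties.CacheControl = "public, max-age=604800"
$blob.ICloudBlob.SetProperties()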

 

CDN Compression

File compression is a simple and effective method to improve file transfer speed and increase page-load performance by reducing a file's size before it is sent from the server. File compression can reduce bandwidth costs and provide a more responsive experience for your users.

There are two ways to enable file compression:

  • Enable compression on your origin server. In this case, the CDN passes along the compressed files and delivers them to clients that request them.
  • Enable compression directly on the CDN edge servers. In this case, the CDN compresses the files and serves them to end users.

 

Enabling compression in the standard tiers

In the Azure portal, you can enable Compression and modify the MIME types list to tune which content formats to compress.
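The same settings can be scripted. The sketch below assumes the classic Az.Cdn cmdlet surface (Get-AzCdnEndpoint/Set-AzCdnEndpoint); newer versions of the module expose the same properties through Update-AzCdnEndpoint instead, and the names shown are placeholders.

# Load the endpoint, enable compression, and choose which MIME types to compress
$endpoint = Get-AzCdnEndpoint -EndpointName "ashstorage-demo" `
                              -ProfileName "cdn-demo-profile" `
                              -ResourceGroupName "rg-cdn-demo"

$endpoint.IsCompressionEnabled = $true
$endpoint.ContentTypesToCompress = @("text/html", "text/css", "application/javascript")

# Push the updated settings back to the CDN
Set-AzCdnEndpoint -CdnEndpoint $endpoint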

 

✔️ It is not recommended to apply compression to already-compressed formats, for example, ZIP, MP3, MP4, or JPG.

 

Azure File Sync

Use Azure File Sync to centralize your organization's file shares in Azure Files, while keeping the flexibility, performance, and compatibility of an on-premises file server. Azure File Sync transforms Windows Server into a quick cache of your Azure file share. You can use any protocol that's available on Windows Server to access your data locally, including SMB, NFS, and FTPS. You can have as many caches as you need across the world.

 

There are many uses and advantages to file sync.

  1. Lift and shift. The ability to move applications that require access between Azure and on-premises systems. Provide write access to the same data across Windows Servers and Azure Files. This lets companies with multiple offices share files across all of their offices.
  2. Branch offices. Branch offices need to back up files, or you need to set up a new server that will connect to Azure storage.
  3. Backup and Disaster Recovery. Once File Sync is implemented, Azure Backup will back up your on-premises data. Also, you can restore file metadata immediately and recall data as needed for rapid disaster recovery.
  4. File Archiving. Only recently accessed data is located on local servers. Infrequently used data moves to Azure in what is called cloud tiering.

✔️ Cloud tiering is an optional feature of Azure File Sync in which frequently accessed files are cached locally on the server while all other files are tiered to Azure Files based on policy settings. When a file is tiered, the Azure File Sync file system replaces the file locally with a pointer, or reparse point. The reparse point represents a URL to the file in Azure Files. When a user opens a tiered file, Azure File Sync seamlessly recalls the file data from Azure Files without the user needing to know that the file is actually stored in Azure. Tiered files have greyed icons with an offline (O) file attribute to let the user know the file is only in Azure.

For more information, you can see:

Planning for an Azure File Sync deployment – https://docs.microsoft.com/en-us/azure/storage/files/storage-sync-files-planning

 

File Sync Components

To gain the most from Azure File Sync, it’s important to understand the terminology.

Storage Sync Service. The Storage Sync Service is the top-level Azure resource for Azure File Sync. The Storage Sync Service resource is a peer of the storage account resource, and can similarly be deployed to Azure resource groups. A distinct top-level resource from the storage account resource is required because the Storage Sync Service can create sync relationships with multiple storage accounts via multiple sync groups. A subscription can have multiple Storage Sync Service resources deployed.

Sync group. A sync group defines the sync topology for a set of files. Endpoints within a sync group are kept in sync with each other. If, for example, you have two distinct sets of files that you want to manage with Azure File Sync, you would create two sync groups and add different endpoints to each sync group. A Storage Sync Service can host as many sync groups as you need.

Registered server. The registered server object represents a trust relationship between your server (or cluster) and the Storage Sync Service. You can register as many servers to a Storage Sync Service instance as you want. However, a server (or cluster) can be registered with only one Storage Sync Service at a time.

Azure File Sync agent. The Azure File Sync agent is a downloadable package that enables Windows Server to be synced with an Azure file share. The Azure File Sync agent has three main components:

  • FileSyncSvc.exe: The background Windows service that is responsible for monitoring changes on server endpoints, and for initiating sync sessions to Azure.
  • StorageSync.sys: The Azure File Sync file system filter, which is responsible for tiering files to Azure Files (when cloud tiering is enabled).
  • PowerShell management cmdlets: PowerShell cmdlets that you use to interact with the Microsoft.StorageSync Azure resource provider. You can find these at the following (default) locations:
    • C:\Program Files\Azure\StorageSyncAgent\StorageSync.Management.PowerShell.Cmdlets.dll
    • C:\Program Files\Azure\StorageSyncAgent\StorageSync.Management.ServerCmdlets.dll

 

Server endpoint. A server endpoint represents a specific location on a registered server, such as a folder on a server volume. Multiple server endpoints can exist on the same volume if their namespaces do not overlap (for example, F:\sync1 and F:\sync2). You can configure cloud tiering policies individually for each server endpoint. You can create a server endpoint via a mount point. Note, mount points within the server endpoint are skipped. You can create a server endpoint on the system volume, but there are two limitations if you do so:

  • Cloud tiering cannot be enabled.
  • Rapid namespace restore (where the system quickly brings down the entire namespace and then starts to recall content) is not performed.

Cloud endpoint. A cloud endpoint is an Azure file share that is part of a sync group. The entire Azure file share syncs, and an Azure file share can be a member of only one cloud endpoint. Therefore, an Azure file share can be a member of only one sync group. If you add an Azure file share that has an existing set of files as a cloud endpoint to a sync group, the existing files are merged with any other files that are already on other endpoints in the sync group.

 

File Sync – Initial Steps

There are a few things that need to be configured before you synchronize your files.

 

  1. Deploy the Storage Sync Service. The Storage Sync Service can be deployed from the Azure portal. You will need to provide a Name, Subscription, Resource Group, and Location.
  2. Prepare Windows Server to use with Azure File Sync. For each server that you intend to use with Azure File Sync, including server nodes in a Failover Cluster, you will need to configure the server. Preparation steps include temporarily disabling Internet Explorer Enhanced Security and ensuring you have the latest PowerShell version.
  3. Install the Azure File Sync Agent. The Azure File Sync agent is a downloadable package that enables Windows Server to be synced with an Azure file share. The Azure File Sync agent installation package should install relatively quickly. We recommend that you keep the default installation path and that you enable Microsoft Update to keep Azure File Sync up to date.
  4. Register Windows Server with the Storage Sync Service. When the Azure File Sync agent installation is finished, the Server Registration UI automatically opens. Registering Windows Server with a Storage Sync Service establishes a trust relationship between your server (or cluster) and the Storage Sync Service. Registration requires your Subscription ID, Resource Group, and Storage Sync Service (created in step one). A server (or cluster) can be registered with only one Storage Sync Service at a time. A PowerShell sketch of steps 1 and 4 follows this list.
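The sketch assumes the Az.StorageSync module and, for the registration step, that it runs on the server where the Azure File Sync agent is installed; all names are placeholders.

# Step 1: Deploy the Storage Sync Service (the top-level Azure File Sync resource)
New-AzStorageSyncService -ResourceGroupName "rg-filesync-demo" `
                         -Name "sss-demo" `
                         -Location "EastUS"

# Step 4: Register this Windows Server with the Storage Sync Service
# (run on the server after the Azure File Sync agent has been installed)
Register-AzStorageSyncServer -ResourceGroupName "rg-filesync-demo" `
                             -StorageSyncServiceName "sss-demo"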

✔️ Continue to the next topic for an explanation of how files are synchronized.

File Sync – Synchronization

Before synchronizing your files, you will need to do two other things.

Create a sync group with a cloud endpoint

In this step you will create a sync group with at least one cloud endpoint. The cloud endpoint is a pointer to an Azure file share. All server endpoints will sync with a cloud endpoint, making the cloud endpoint the hub. The storage account for the Azure file share must be located in the same region as the Storage Sync Service. Notice you will need a storage account and a file share.

The entirety of the Azure file share will be synced, with one exception: a special folder, comparable to the hidden "System Volume Information" folder on an NTFS volume, will be provisioned. This directory is called ".SystemShareInformation". It contains important sync metadata that will not sync to other endpoints. Do not use or delete it!
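Here is a minimal PowerShell sketch of these two objects, again assuming the Az.StorageSync module; the storage account, file share, and other names are placeholders.

# Create the sync group inside the Storage Sync Service
New-AzStorageSyncGroup -ResourceGroupName "rg-filesync-demo" `
                       -StorageSyncServiceName "sss-demo" `
                       -Name "sg-corpshare"

# Add the Azure file share as the cloud endpoint (the hub of the sync group)
$storageAccount = Get-AzStorageAccount -ResourceGroupName "rg-filesync-demo" -Name "filesyncstorage01"

New-AzStorageSyncCloudEndpoint -ResourceGroupName "rg-filesync-demo" `
                               -StorageSyncServiceName "sss-demo" `
                               -SyncGroupName "sg-corpshare" `
                               -Name "cloud-corpshare" `
                               -StorageAccountResourceId $storageAccount.Id `
                               -AzureFileShareName "corpshare"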

 

Create server endpoints

Creating a server endpoint requires the following (a PowerShell sketch follows this list):

  • Registered server. The name of the server or cluster where you want to create the server endpoint.
  • Path. The Windows Server path to be synced as part of the sync group. The path should not be the root volume.
  • Cloud Tiering. A switch to enable or disable cloud tiering. Regardless of whether cloud tiering is enabled, your Azure file share always has a complete copy of the data in the sync group.
  • Volume Free Space. The amount of free space to reserve on the volume on which the server endpoint is located. For example, if volume free space is set to 50% on a volume that has a single server endpoint, roughly half the amount of data is tiered to Azure Files.
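Those inputs map to a call along these lines, assuming the Az.StorageSync module and the registered server and sync group from the previous steps; the path and percentage are illustrative.

# Look up the registered server to reference its resource ID (assumes one registered server)
$server = Get-AzStorageSyncServer -ResourceGroupName "rg-filesync-demo" `
                                  -StorageSyncServiceName "sss-demo"

# Create a server endpoint for D:\CorpShare with cloud tiering enabled
New-AzStorageSyncServerEndpoint -ResourceGroupName "rg-filesync-demo" `
                                -StorageSyncServiceName "sss-demo" `
                                -SyncGroupName "sg-corpshare" `
                                -Name "srv-corpshare" `
                                -ServerResourceId $server.ResourceId `
                                -ServerLocalPath "D:\CorpShare" `
                                -CloudTiering `
                                -VolumeFreeSpacePercent 50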

✔️ Azure File Sync moves file data and metadata exclusively over HTTPS and requires port 443 to be open outbound. Based on policies in your datacenter, branch, or region, further restricting traffic over port 443 to specific domains may be desired or required.

✔️ There is a lot to consider when synchronizing large amounts of files. For example, you may want to copy the server files to the Azure file share before you configure file sync. You will on-board and monitor File Sync in the lab.

 

Import and Export Service

The Azure Import/Export service is used to securely import large amounts of data to Azure Blob storage and Azure Files by shipping disk drives to an Azure datacenter. This service can also be used to transfer data from Azure Blob storage to disk drives and ship them to your on-premises sites. Data from one or more disk drives can be imported either to Azure Blob storage or Azure Files. With the Azure Import/Export service, you supply your own disk drives and transfer the data yourself.

Consider using the Azure Import/Export service when uploading or downloading data over the network is too slow or getting additional network bandwidth is cost-prohibitive. Scenarios where this would be useful include:

  • Migrating data to the cloud. Move large amounts of data to Azure quickly and cost effectively.
  • Content distribution. Quickly send data to your customer sites.
  • Backup. Take backups of your on-premises data to store in Azure Blob storage.
  • Data recovery. Recover large amounts of data stored in blob storage and have it delivered to your on-premises location.

✔️ A single job can include up to 10 disks. You can create jobs directly from the Azure portal. You can also accomplish this programmatically by using the Azure Storage Import/Export REST API.

For more information, you can see:

Azure Import and Export Service – https://azure.microsoft.com/en-us/documentation/articles/storage-import-export-service/

 

Components and Requirements

This topic lists the components that make up the Import/Export service and the requirements for using the service.

Import and Export service components

  • Import/Export service. This service, available in the Azure portal, helps the user create and track data import (upload) and export (download) jobs.
  • WAImportExport tool. This is a command-line tool that does the following:
    • Prepares your disk drives that are shipped for import.
    • Facilitates copying your data to the drive.
    • Encrypts the data on the drive with BitLocker.
    • Generates the drive journal files used during import creation.
    • Helps identify numbers of drives needed for export jobs.

Note: The WAImportExport tool is available in two versions, version 1 and version 2. We recommend that you use version 1 for import/export into Azure Blob storage, and version 2 for importing data into Azure Files.

  • Disk Drives. You can ship Solid-state drives (SSDs) or Hard disk drives (HDDs) to the Azure datacenter. When creating an import job, you ship disk drives containing your data. When creating an export job, you ship empty drives to the Azure datacenter.

Requirements

Operating systems

  • Windows Server 64-bit OS that supports BitLocker Drive Encryption.
  • Windows clients that have .NET Framework 4.5.1 and BitLocker.

Supported storage accounts

  • General Purpose v2 storage accounts (recommended for most scenarios)
  • Blob Storage accounts
  • General Purpose v1 storage accounts (both Classic or Azure Resource Manager deployments)

Supported storage types

  • Import jobs can include Azure Blob storage, Azure File storage, Block blobs, and Page blobs.
  • Export jobs can include Azure Blob storage, Block blobs, Page blobs, and Append blobs. Azure Files is not supported for export.

Supported disks

  • SSD. 2.5″ drives; all are supported.
  • HDD. 2.5″ and 3.5″ SATA II and SATA III drives are supported. Not supported: external HDDs with a built-in USB adaptor, and disks inside the casing of an external HDD.

Import and Export Tool

The Microsoft Azure Import/Export Tool is the drive preparation and repair tool that you can use with the Microsoft Azure Import/Export service. You can use the tool for the following functions:

  • Before creating an import job, you can use this tool to copy data to the hard drives you are going to ship to an Azure datacenter.
  • After an import job has completed, you can use this tool to repair any blobs that were corrupted, were missing, or conflicted with other blobs.
  • After you receive the drives from a completed export job, you can use this tool to repair any files that were corrupted or missing on the drives.

The Import/Export service requires the use of internal SATA II/III HDDs or SSDs. Each disk contains a single NTFS volume that you encrypt with BitLocker when preparing the drive. To prepare a drive, you must connect it to a computer running a 64-bit version of the Windows client or server operating system and run the WAImportExport tool from that computer. The WAImportExport tool handles data copy, volume encryption, and creation of journal files. Journal files are necessary to create an import/export job and help ensure the integrity of the data transfer.

 

What is a journal file?

Each time you run the WAImportExport tool to copy files to the hard drive, the tool creates a copy session. The state of the copy session is written to the journal file. If a copy session is interrupted (for example, due to a system power loss), it can be resumed by running the tool again and specifying the journal file on the command line.

For each hard drive that you prepare with the Azure Import/Export Tool, the tool will create a single journal file with the name DriveID.xml, where DriveID is the serial number associated with the drive that the tool reads from the disk. You will need the journal files from all of your drives to create the import job. The journal file can also be used to resume drive preparation if the tool is interrupted.

 

Simple Import Example

WAImportExport.exe PrepImport /j:<JournalFile> /id:<SessionId> /DataSet:<dataset.csv>

  • PrepImport. Indicates the tool is preparing drives for an import job.
  • JournalFile. Path to the journal file that will be created. A journal file tracks a set of drives and records the progress in preparing these drives. The journal file must always be specified.
  • SessionId. The session ID is used to identify a copy session. It is used to ensure accurate recovery of an interrupted copy session.
  • DataSet. A CSV file that contains a list of directories and/or a list of files to be copied to target drives. An illustrative example follows this list.
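As an illustration only, a first copy session might look like the following. The journal file, session ID, and CSV contents are hypothetical, and the tool has additional parameters (such as the destination storage account key) that are covered in its documentation.

# Prepare a drive for an import job: copy the data listed in dataset.csv,
# encrypt the target drive with BitLocker, and record progress in the journal file
.\WAImportExport.exe PrepImport /j:FirstDrive.xml /id:session#1 /DataSet:dataset.csv

# Illustrative dataset.csv (version 1 column layout):
#   BasePath,DstBlobPathOrPrefix,BlobType,Disposition,MetadataFile,PropertiesFile
#   "D:\Backups\","backups/",BlockBlob,rename,"None",None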

✔️ The WAImportExport tool is available from the Microsoft Download site at https://aka.ms/Welhs7.

 

Import Jobs

An import job securely transfers large amounts of data to Azure Blob storage (block and page blobs) and Azure Files by shipping disk drives to an Azure datacenter. In this case, you will be shipping hard drives containing your data.

Your job will be configured in the portal. Notice the need for the journal file, created by the Import/Export tool, and a storage account to receive the data. Not shown is the return shipping information.

To perform an import, follow these steps:

  1. Create an Azure Storage account.
  2. Identify the number of disks that you will need to accommodate all the data that you want to transfer.
  3. Identify a computer that you will use to perform the data copy, attach the physical disks that you will ship to the target Azure datacenter, and install the WAImportExport tool.
  4. Run the WAImportExport tool to copy the data, encrypt the drive with BitLocker, and generate journal files.
  5. Use the Azure portal to create an import job referencing the Azure Storage account. As part of the job definition, specify the destination address representing the Azure region where the Azure Storage account resides.
  6. Ship the disks to the destination that you specified when creating the import job and update the job by providing the shipment tracking number.
  7. Once the disks arrive at the destination, the Azure datacenter staff will carry out the data copy to the target Azure Storage account and ship the disks back to you.

 

Export Jobs

Export jobs transfer data from Azure Storage to hard disk drives and ship them to your on-premises sites.

In order to perform an export, follow these steps:

  1. Identify the data in the Azure Storage blobs that you intend to export.
  2. Identify the number of disks that you will need to accommodate all the data you want to transfer.
  3. Use the Azure portal to create an export job referencing the Azure Storage account. As part of the job definition, specify the blobs you want to export, the return address, and your carrier account number. Microsoft will ship your disks back to you after the export process is complete.
  4. Ship the required number of disks to the Azure region hosting the storage account. Update the job by providing the shipment tracking number.
  5. Once the disks arrive at the destination, Azure datacenter staff will carry out the data copy from the storage account to the disks that you provided, encrypt the volumes on the disks by using BitLocker, and ship them back to you. The BitLocker keys will be available in the Azure portal, allowing you to decrypt the content of the disks and copy them to your on-premises storage.

 

AzCopy

An alternative method for transferring data is AzCopy. AzCopy v10 is the next-generation command-line utility for copying data to and from Microsoft Azure Blob and File storage, offering a redesigned command-line interface and a new architecture for high-performance, reliable data transfers. Using AzCopy, you can copy data between a file system and a storage account, or between storage accounts.

What’s new

  • Synchronize a file system to Azure Blob storage or vice versa. Ideal for incremental copy scenarios.
  • Supports Azure Data Lake Storage Gen2 APIs.
  • Supports copying an entire account (Blob service only) to another account.
  • Account-to-account copy now uses the new Put from URL APIs. No data transfer to the client is needed, which makes the transfer faster.
  • List/remove files and blobs in a given path.
  • Supports wildcard patterns in a path, as well as --include and --exclude flags.
  • Improved resiliency: every AzCopy instance creates a job order and a related log file. You can view and restart previous jobs and resume failed jobs. AzCopy will also automatically retry a transfer after a failure.
  • General performance improvements.

Authentication options

  • Azure Active Directory (supported for Blob and ADLS Gen2 services). Use .\azcopy login to sign in using Azure Active Directory. The user should have the Storage Blob Data Contributor role assigned to write to Blob storage using Azure Active Directory authentication.
  • SAS tokens (supported for Blob and File services). Append the SAS token to the blob path on the command line to use it.

Getting started

AzCopy has a simple, self-documenting syntax. Here's how you can get a list of available commands:

azcopy --help

The basic syntax for AzCopy commands is:

azcopy copy [source] [destination] [flags]
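For example (the account, container, and SAS values below are placeholders):

# Upload a local folder to a blob container, authenticating with a SAS token
azcopy copy "C:\local\data" "https://<account>.blob.core.windows.net/<container>?<SAS>" --recursive

# Keep the container in sync with the local folder (incremental copy scenario)
azcopy sync "C:\local\data" "https://<account>.blob.core.windows.net/<container>?<SAS>"

# Or sign in with Azure Active Directory instead of using a SAS token
azcopy login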

✔️ AzCopy is available on Windows, Linux, and MacOS.

For more information, you can see:

Download and install AzCopy on Windows – https://docs.microsoft.com/en-us/azure/storage/common/storage-use-azcopy#download-and-install-azcopy-on-windows

 

Offline – Use Cases

There are many scenarios where offline Data Box products can be used. Let's look at one-time migration, incremental transfer, and periodic uploads.

One-time migration. When large amounts of on-premises data are moved to Azure. Examples include:

  • Moving data from offline tapes to archival data in Azure cool storage.
  • Moving a media library from offline tapes into Azure to create an online media library.
  • Migrating your VM farm, SQL Server, and applications to Azure.
  • Moving historical data to Azure for in-depth analysis and reporting, using HDInsight.
  • Moving backup data to Azure for offsite storage.

Incremental transfer. After an initial bulk transfer, moving new data. For example, backup solutions partners such as Commvault and Data Box are used to move the initial large historical backup to Azure. Once complete, the incremental data is transferred via network to Azure storage.

Periodic uploads. When a large amount of data is generated periodically and needs to be moved to Azure. For example, in energy exploration, where video content is generated on oil rigs and windmill farms.

For more information, you can see:

Video - Case Study: Azure Data Box | Oceaneering Intl –

 

Offline – Data Box Products

Data is being generated at record levels, and moving stored or in-flight data to the cloud can be challenging. Azure Data Box products provide both offline and online solutions for moving your data to the cloud. In this topic we will concentrate on the offline data products.

Offline solutions transfer large amounts of data to Azure where there is limited or no network bandwidth.

Data Box

  • 100 TB total capacity per order
  • 80 TB usable capacity per order
  • One device per order
  • Supports Azure Blob or Files
  • Copy data to 10 storage accounts
  • 1×1/10 Gbps RJ45, 2×10 Gbps SFP+ interface
  • Uses AES 256-bit encryption
  • Copy data using standard NAS protocols (SMB/NFS)

Data Box Disk

  • 40 TB total capacity per order
  • 35 TB usable capacity per order
  • Up to five disks per order
  • Supports Azure Blob
  • Copy data to one storage account
  • USB/SATA II, III interface
  • Uses AES 128-bit encryption
  • Copy data using Robocopy or similar tools

Data Box Heavy (Preview)

  • 1 PB total capacity per order
  • 800 TB usable capacity per order
  • One device per order
  • Supports Azure Blob or Files
  • Copy data to 10 storage accounts
  • 1×1/10 Gbps RJ45, 4×40 Gbps QSFP+ interface
  • Uses AES 256-bit encryption
  • Copy data using standard NAS protocols (SMB/NFS)

 

Offline – Product Selection

Data Box is designed to move large amounts of data to Azure with no impact to the network. When selecting an offline product, consider speed and security.

Speed. Use the estimated speed to determine which box will transfer the data in the time frame you need. For data sizes < 40 TB, use Data Box Disk, and for data sizes > 500 TB, sign up for Data Box Heavy.

  • Data Box Disk. USB 3.0 connection.
  • Data Box. 1 Gbps or 10 Gbps network interfaces.
  • Data Box Heavy. High-performance 40 Gbps network interfaces.

Security. All products can only be unlocked with a password provided in the Azure portal. All services are protected by Azure security features. Ensure your selection meets your organization's security requirements.

  • Data Box Disk. The disks are tamper-resistant and support secure update capability. The data is secured with AES 128-bit encryption.
  • Data Box. Rugged device casing secured by tamper-resistant screws and tamper-evident stickers. The data is secured with AES 256-bit encryption.
  • Data Box Heavy. Rugged device casing secured by tamper-resistant screws and tamper-evident stickers. The data is secured with AES 256-bit encryption.

✔️ Once your data is uploaded to Azure, the disks on the device are wiped clean, in accordance with NIST 800-88r1 standards.

 

Offline – Implementation Offline Products

The implementation workflow is the same for Data Box, Data Box Disk, and Data Box Heavy.

  1. Order. Create an order in the Azure portal, provide shipping information, and specify the destination Azure storage account for your data. If the device is available, Azure prepares and ships the device with a shipment tracking ID.
  2. Receive, unpack, connect, and unlock. Once the device is delivered, cable the device for network and power using the specified cables. Turn on and connect to the device. Configure the device network and mount shares on the host computer from where you want to copy the data.
  3. Copy and validate the data. Copy data to the Data Box shares.
  4. Return, upload, verify. Prepare, turn off, and ship the device back to the Azure datacenter. Data is automatically copied from the device to Azure. The device disks are securely erased as per the National Institute of Standards and Technology (NIST) guidelines.

✔️ Take a few minutes to review each link. The links are for Data Box; there are similar pages for Data Box Disk and Data Box Heavy.

✔️ Throughout this process, you are notified via email on all status changes.

 

Online – Data Box Gateway

Data Box Gateway

Data Box Gateway transfers data to and from Azure. It's a virtual appliance based on a virtual machine provisioned in your virtualized environment or hypervisor. The virtual device resides on-premises, and you write data to it using the NFS and SMB protocols. The device then transfers your data to Azure block blobs, page blobs, or Azure Files.

Use cases

  • Cloud archival. Copy hundreds of TBs of data to Azure storage using Data Box Gateway in a secure and efficient manner. The data can be ingested one time or on an ongoing basis for archival scenarios.
  • Data aggregation. Aggregate data from multiple sources into a single location in Azure Storage for data processing and analytics.
  • Integration with on-premises workloads. Integrate with on-premises workloads such as backup and restore that use cloud storage and need local access for commonly used files.

Benefits

  • Easy data transfer. Makes moving data in and out of Azure storage as easy as working with a local network share.
  • High speed performance. Takes the hassle out of network data transport with high-performance transfers to and from Azure.
  • Fast access. Caches the most recent files for fast access to on-premises files.
  • Limited bandwidth usage. Data can be written to Azure even when the network is throttled to limit usage during peak business hours.

Features

  • Virtual device provisioned in your hypervisor
  • Storage gateway
  • Supports SMB or NFS protocols
  • Supports Azure Blob or Files
  • Supports Hyper-V or VMware

 

Online – Data Box Edge

Data Box Edge

This on-premises physical network appliance transfers data to and from Azure. Analyze, process, and transform your on-premises data before uploading it to the cloud using AI-enabled edge compute capabilities. Azure Data Box Edge is an AI-enabled edge computing device with network data transfer capabilities.

Use cases

  • Pre-process data. Analyze data from on-premises or IoT devices to quickly get to results while staying close to where the data is generated. Data Box Edge transfers the full data set to the cloud to perform more advanced processing or deeper analytics. Preprocessing can be used to:
    • Aggregate data.
    • Modify data, for example to remove Personally Identifiable Information (PII).
    • Subset and transfer the data needed for deeper analytics in the cloud.
    • Analyze and react to IoT events.
  • Inference with Azure Machine Learning. With Data Box Edge, you can run Machine Learning (ML) models to get quick results that can be acted on before the data is sent to the cloud. The full data set is transferred to continue to retrain and improve your ML models.
  • Transfer data over the network to Azure. Use Data Box Edge to easily and quickly transfer data to Azure to enable further compute and analytics or for archival purposes.

Benefits

  • Easy data transfer. Makes moving data in and out of Azure storage as easy as working with a local network share.
  • High speed performance. Enables high-performance transfers to and from Azure.
  • Fast access. Caches the most recent files for fast access to on-premises files.
  • Limited bandwidth usage. Data can be written to Azure even when the network is throttled to limit usage during peak business hours.
  • Transform data. Enables analysis, processing, or filtering of data as it moves to Azure.

Features

  • AI-enabled edge compute
  • Physical device shipped by Microsoft
  • Storage gateway
  • Supports SMB or NFS protocols
  • Supports Azure Blob or Files
  • 1U chassis, 2×10 core CPU, 64 GB RAM
  • 12 TB local NVMe SSD storage
  • 4×25 GbE network interfaces

 

Online – Implementation Online Products

The following steps outline the workflow for using Data Box Gateway:

  1. Prepare. Create and configure your Data Box Gateway resource prior to provisioning a Data Box Gateway virtual device. This includes: checking prerequisites, creating a new Data Box Gateway in the portal, downloading the virtual device image for Hyper-V or VMware, and obtaining the activation key. This key is used to activate and connect your Data Box Gateway device with the resource.
  2. Provision. For Hyper-V, provision and connect to a Data Box Gateway virtual device on a host system running Hyper-V on Windows Server 2016 or Windows Server 2012 R2. For VMware, provision and connect to a Data Box Gateway virtual device on a host system running VMware ESXi 6.0 or 6.5. For both hypervisors you will: verify requirements, provision the device, start the device, and get the IP address.
  3. Connect, set up, and activate. (https://docs.microsoft.com/en-us/azure/databox-online/data-box-gateway-deploy-connect-setup-activate). Connect to the local web UI setup page. Provide the device name and activation key. The Network settings, Web proxy settings, and Time settings are optional.
  4. Add and connect to the share. (https://docs.microsoft.com/en-us/azure/databox-online/data-box-gateway-deploy-add-shares). Your share can be SMB or NFS. There are settings for both in the portal. Once the share is created, you can connect and begin transferring data.

✔️ Be sure to view the documentation for each step.

✔️ The steps for Data Box Edge are the same, with the addition of the IoT device.

For more information, you can see:

Tutorial: Prepare to deploy Azure Data Box Edge –https://docs.microsoft.com/en-us/azure/databox-online/data-box-edge-deploy-prep

 

References