CloudWiki

Microsoft Azure

Data Lake Storage

Azure Data Lake Storage is a cloud-based storage solution optimized for big data analytics workloads. It is designed to store and process large amounts of structured and unstructured data at scale, and can be used for a variety of data processing scenarios such as batch processing, real-time stream processing, and interactive querying. Azure Data Lake Storage is built on top of Azure Blob Storage and adds features and capabilities specifically for big data workloads. It supports the Hadoop Distributed File System (HDFS) interface, making it compatible with existing Hadoop-based big data tools and applications. It also provides a hierarchical namespace that allows for more efficient data access and management, and supports POSIX-style access control lists (ACLs) for fine-grained access control and auditing.
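Because Gen2 exposes a Hadoop-compatible endpoint, tools address data through the abfss:// URI scheme. As a hedged sketch (the account and filesystem names are placeholders, and an authenticated Azure CLI session is assumed), the hierarchical-namespace setting can be inspected like this:

```shell
# Hadoop-compatible tools reach ADLS Gen2 paths via the abfss:// scheme:
#   abfss://<filesystem>@<account>.dfs.core.windows.net/<path>
#
# Check whether the hierarchical namespace is enabled on an account
# (placeholder account name; requires a logged-in Azure CLI session):
az storage account show --name <account> --query isHnsEnabled
```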
Terraform Name
azurerm_storage_data_lake_gen2_filesystem

Data Lake Storage attributes:

The following arguments are supported:

  • name - (Required) The name of the Data Lake Gen2 File System which should be created within the Storage Account. Must be unique within the Storage Account in which it is located. Changing this forces a new resource to be created.
  • storage_account_id - (Required) Specifies the ID of the Storage Account in which the Data Lake Gen2 File System should exist. Changing this forces a new resource to be created.
  • properties - (Optional) A mapping of Key to Base64-Encoded Values which should be assigned to this Data Lake Gen2 File System.
  • ace - (Optional) One or more ace blocks as defined below to specify the entries for the ACL for the path.
  • owner - (Optional) Specifies the Object ID of the Azure Active Directory User to make the owning user of the root path (i.e. /). Possible values also include $superuser.
  • group - (Optional) Specifies the Object ID of the Azure Active Directory Group to make the owning group of the root path (i.e. /). Possible values also include $superuser.

NOTE:

The Storage Account requires account_kind to be either StorageV2 or BlobStorage. In addition, is_hns_enabled has to be set to true.

An ace block supports the following:

  • scope - (Optional) Specifies whether the ACE represents an access entry or a default entry. Default value is access.
  • type - (Required) Specifies the type of entry. Can be user, group, mask or other.
  • id - (Optional) Specifies the Object ID of the Azure Active Directory User or Group that the entry relates to. Only valid for user or group entries.
  • permissions - (Required) Specifies the permissions for the entry in rwx form. For example, rwx gives full permissions but r-- only gives read permissions.
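As a sketch of how these arguments combine, the following fragment grants one Azure AD user read/execute access on the root path; it assumes a storage account resource named example as in the full example below, and the object ID shown is a placeholder, not a real principal:

```hcl
resource "azurerm_storage_data_lake_gen2_filesystem" "example" {
  name               = "example"
  storage_account_id = azurerm_storage_account.example.id

  # Access entry for a single AAD user (placeholder object ID).
  ace {
    scope       = "access"
    type        = "user"
    id          = "00000000-0000-0000-0000-000000000000"
    permissions = "r-x"
  }

  # POSIX ACLs require a mask entry whenever named user/group entries exist.
  ace {
    scope       = "access"
    type        = "mask"
    permissions = "r-x"
  }
}
```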

More details on ACLs can be found here: https://docs.microsoft.com/azure/storage/blobs/data-lake-storage-access-control#access-control-lists-on-files-and-directories

Associating resources with a Data Lake Storage
Resources do not "belong" to a Data Lake Storage. Rather, a Data Lake Gen2 File System exists within a Storage Account, and access to its paths is granted through access control entries (the ace blocks described above).
Create Data Lake Storage via Terraform:
The following HCL manages a Data Lake Gen2 File System within an Azure Storage Account.
Syntax:

resource "azurerm_resource_group" "example" {
  name     = "example-resources"
  location = "West Europe"
}

resource "azurerm_storage_account" "example" {
  name                     = "examplestorageacc"
  resource_group_name      = azurerm_resource_group.example.name
  location                 = azurerm_resource_group.example.location
  account_tier             = "Standard"
  account_replication_type = "LRS"
  # Data Lake Gen2 requires a StorageV2 (or BlobStorage) account
  # with the hierarchical namespace enabled.
  account_kind             = "StorageV2"
  is_hns_enabled           = true
}

resource "azurerm_storage_data_lake_gen2_filesystem" "example" {
  name               = "example"
  storage_account_id = azurerm_storage_account.example.id

  # Property values must be Base64-encoded ("aGVsbG8=" is "hello").
  properties = {
    hello = "aGVsbG8="
  }
}
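The values in the properties map must be Base64-encoded, per the argument description above. The "aGVsbG8=" value in the example is simply the string hello encoded, which can be reproduced with a standard tool:

```shell
# Reproduce the Base64 value used in the `properties` map above.
printf 'hello' | base64
# -> aGVsbG8=
```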

Create Data Lake Storage via CLI:
Parameters:

az dls fs create --account
                --path
                [--content]
                [--folder]
                [--force]

Example:

az dls fs create --account {account} --folder --path {path}
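Note that the az dls command group targets Data Lake Storage Gen1 accounts. For a Gen2 account like the one provisioned in the Terraform example (is_hns_enabled = true), the analogous command is az storage fs create; the names below are placeholders:

```shell
# Create a file system in a Gen2-enabled storage account.
# <filesystem> and <account> are placeholders; requires a logged-in session.
az storage fs create --name <filesystem> --account-name <account> --auth-mode login
```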

Best Practices for Data Lake Storage

Categorized by Availability, Security & Compliance and Cost
