SageMaker Notebook

Compute
A SageMaker notebook instance is a machine learning (ML) compute instance running the Jupyter Notebook App, a web-based interactive computing platform for editing and running notebook documents in a web browser. SageMaker is a fully managed machine learning service that lets data scientists and developers build and train machine learning models and deploy them directly into a production-ready hosted environment.
aws_sagemaker_notebook_instance
SageMaker Notebook attributes:
  • name - (Required) The name of the notebook instance (must be unique).
  • role_arn - (Required) The ARN of the IAM role to be used by the notebook instance which allows SageMaker to call other services on your behalf.
  • instance_type - (Required) The name of the ML compute instance type.
  • platform_identifier - (Optional) The platform identifier of the notebook instance runtime environment. This value can be either notebook-al1-v1, notebook-al2-v1, or notebook-al2-v2, depending on which version of Amazon Linux you require.
  • volume_size - (Optional) The size, in GB, of the ML storage volume to attach to the notebook instance. The default value is 5 GB.
  • subnet_id - (Optional) The VPC subnet ID.
  • security_groups - (Optional) The associated security groups.
  • accelerator_types - (Optional) A list of Elastic Inference (EI) instance types to associate with this notebook instance. See Elastic Inference Accelerator for more details. Valid values: ml.eia1.medium, ml.eia1.large, ml.eia1.xlarge, ml.eia2.medium, ml.eia2.large, ml.eia2.xlarge.
  • additional_code_repositories - (Optional) An array of up to three Git repositories to associate with the notebook instance. These can be either the names of Git repositories stored as resources in your account, or the URL of Git repositories in AWS CodeCommit or in any other Git repository. These repositories are cloned at the same level as the default repository of your notebook instance.
  • default_code_repository - (Optional) The Git repository associated with the notebook instance as its default code repository. This can be either the name of a Git repository stored as a resource in your account, or the URL of a Git repository in AWS CodeCommit or in any other Git repository.
  • direct_internet_access - (Optional) Set to Disabled to disable internet access to the notebook. Requires security_groups and subnet_id to be set. Supported values: Enabled (Default) or Disabled. If set to Disabled, the notebook instance can access resources only in your VPC and cannot connect to Amazon SageMaker training and endpoint services unless you configure a NAT Gateway in your VPC.
  • instance_metadata_service_configuration - (Optional) Information on the IMDS configuration of the notebook instance. See details below.
  • kms_key_id - (Optional) The AWS Key Management Service (AWS KMS) key that Amazon SageMaker uses to encrypt data on the storage volume attached to the notebook instance.
  • lifecycle_config_name - (Optional) The name of a lifecycle configuration to associate with the notebook instance.
  • root_access - (Optional) Whether root access is Enabled or Disabled for users of the notebook instance. The default value is Enabled.
  • tags - (Optional) A map of tags to assign to the resource. If configured with a provider default_tags configuration block present, tags with matching keys will overwrite those defined at the provider-level.
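As an illustration of how several of the optional attributes above fit together, here is a minimal sketch. The resource names, the lifecycle script body, and the repository URL are hypothetical placeholders, and aws_iam_role.role is assumed to be defined elsewhere in the configuration.

```hcl
# Hypothetical lifecycle configuration; the script body is a placeholder.
resource "aws_sagemaker_notebook_instance_lifecycle_configuration" "lc" {
  name     = "example-lifecycle-config"
  on_start = base64encode("#!/bin/bash\necho 'notebook started'\n")
}

resource "aws_sagemaker_notebook_instance" "custom" {
  name          = "example-custom-notebook"
  role_arn      = aws_iam_role.role.arn # assumed to exist elsewhere
  instance_type = "ml.t2.medium"

  platform_identifier   = "notebook-al2-v2" # one of the values listed above
  volume_size           = 20                # GB; the default is 5
  root_access           = "Disabled"
  lifecycle_config_name = aws_sagemaker_notebook_instance_lifecycle_configuration.lc.name

  # Hypothetical repository URL, for illustration only.
  default_code_repository = "https://github.com/example-org/example-repo.git"
}
```

Referencing the lifecycle configuration by its resource attribute rather than a literal string lets Terraform create it before the notebook instance.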

instance_metadata_service_configuration

  • minimum_instance_metadata_service_version - (Optional) Indicates the minimum IMDS version that the notebook instance supports. When "1" is passed, both IMDSv1 and IMDSv2 are supported. Valid values are 1 and 2.
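A minimal sketch of the nested block described above, requiring IMDSv2. The resource name and notebook name are hypothetical, and aws_iam_role.role is assumed to be defined elsewhere.

```hcl
resource "aws_sagemaker_notebook_instance" "imdsv2" {
  name          = "example-imdsv2-notebook"
  role_arn      = aws_iam_role.role.arn # assumed to exist elsewhere
  instance_type = "ml.t2.medium"

  # "2" sets IMDSv2 as the minimum version; "1" would allow both versions.
  instance_metadata_service_configuration {
    minimum_instance_metadata_service_version = "2"
  }
}
```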

Associating resources with a SageMaker Notebook
Resources do not "belong" to a SageMaker Notebook. Rather, one or more Security Groups are associated with a resource.
Create SageMaker Notebook via Terraform:
The following HCL creates a SageMaker Notebook Instance resource.
Syntax:

resource "aws_sagemaker_notebook_instance" "ni" {
  name          = "my-notebook-instance"
  # aws_iam_role.role must be defined elsewhere in the configuration.
  role_arn      = aws_iam_role.role.arn
  instance_type = "ml.t2.medium"

  tags = {
    Name = "foo"
  }
}
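The example above references aws_iam_role.role without defining it. One possible definition is sketched below; the role name is hypothetical, and attaching the AWS managed policy AmazonSageMakerFullAccess is an assumption made for brevity, since a narrower policy is usually preferable.

```hcl
# Role that the SageMaker service is allowed to assume.
resource "aws_iam_role" "role" {
  name = "example-sagemaker-notebook-role"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Action    = "sts:AssumeRole"
      Principal = { Service = "sagemaker.amazonaws.com" }
    }]
  })
}

# Broad managed policy, used here only to keep the sketch short.
resource "aws_iam_role_policy_attachment" "sagemaker_access" {
  role       = aws_iam_role.role.name
  policy_arn = "arn:aws:iam::aws:policy/AmazonSageMakerFullAccess"
}
```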

Create SageMaker Notebook via CLI:
Parameters:

aws sagemaker create-notebook-instance
--notebook-instance-name <value>
--instance-type <value>
[--subnet-id <value>]
[--security-group-ids <value>]
--role-arn <value>
[--kms-key-id <value>]
[--tags <value>]
[--lifecycle-config-name <value>]
[--direct-internet-access <value>]
[--volume-size-in-gb <value>]
[--accelerator-types <value>]
[--default-code-repository <value>]
[--additional-code-repositories <value>]
[--root-access <value>]
[--platform-identifier <value>]
[--instance-metadata-service-configuration <value>]
[--cli-input-json | --cli-input-yaml]
[--generate-cli-skeleton <value>]
[--debug]
[--endpoint-url <value>]
[--no-verify-ssl]
[--no-paginate]
[--output <value>]
[--query <value>]
[--profile <value>]
[--region <value>]
[--version <value>]
[--color <value>]
[--no-sign-request]
[--ca-bundle <value>]
[--cli-read-timeout <value>]
[--cli-connect-timeout <value>]
[--cli-binary-format <value>]
[--no-cli-pager]
[--cli-auto-prompt]
[--no-cli-auto-prompt]

Example:
Best Practices for SageMaker Notebook

Categorized by Availability, Security & Compliance and Cost

  • Warning - Ensure Amazon SageMaker Notebook Instance is in VPC
  • Critical - Ensure SageMaker Notebook Data is Encrypted
  • Warning - Ensure SageMaker Notebook Direct Internet Access is disabled
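The three rules above map onto the subnet_id, security_groups, kms_key_id, and direct_internet_access attributes. The following is a minimal sketch of a notebook instance that would satisfy them. The variable names, resource names, and KMS key are hypothetical, and aws_iam_role.role is assumed to be defined elsewhere.

```hcl
# Hypothetical inputs: an existing subnet and security groups in your VPC.
variable "subnet_id" {
  type = string
}

variable "security_group_ids" {
  type = list(string)
}

# Hypothetical customer managed key for encrypting notebook data.
resource "aws_kms_key" "notebook" {
  description = "Example key for SageMaker notebook encryption"
}

resource "aws_sagemaker_notebook_instance" "private" {
  name          = "example-private-notebook"
  role_arn      = aws_iam_role.role.arn # assumed to exist elsewhere
  instance_type = "ml.t2.medium"

  # Rule: notebook instance is in a VPC.
  subnet_id       = var.subnet_id
  security_groups = var.security_group_ids

  # Rule: direct internet access is disabled (requires the two settings above).
  direct_internet_access = "Disabled"

  # Rule: notebook data is encrypted.
  kms_key_id = aws_kms_key.notebook.arn
}
```

With direct internet access disabled, the instance reaches SageMaker training and endpoint services only if the VPC provides a route such as a NAT Gateway, as noted in the attribute description.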