Start a New Training Job on NetBook UI

How to run serverless experiments on NetBook

Provide the following details so that NetBook can start training for you:

  • GitHub code URL:

    • Public Repository

      • If your code is in a public repository, you can directly provide the code repo URL and the branch on which the training code is hosted.

    • Private Repository

      • We currently don't have any credentials-store support. To give us access to a private repository, include your username and access token in the repo URL, as in the following command:

      • git clone https://{username}:{access_token}@yourbitbucket.org/user/repo.git

      • You can generate the access_token in the account settings of your git provider.
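The URL form above can be assembled programmatically. The following is a minimal sketch, not part of NetBook: the helper name `with_credentials` and the sample username and token are hypothetical. It percent-encodes the credentials, which matters when a token contains characters such as `/` or `@` that would otherwise break the URL.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def with_credentials(repo_url: str, username: str, access_token: str) -> str:
    """Embed percent-encoded credentials into an HTTPS repository URL.

    Hypothetical helper for illustration; NetBook only needs the final URL.
    """
    parts = urlsplit(repo_url)
    if parts.scheme != "https" or not parts.hostname:
        raise ValueError("expected an https:// repository URL")
    # safe="" encodes every reserved character, including '/' and '@'.
    userinfo = f"{quote(username, safe='')}:{quote(access_token, safe='')}"
    netloc = f"{userinfo}@{parts.hostname}"
    if parts.port:
        netloc += f":{parts.port}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))


# Sample values only; substitute your own repository, username, and token.
print(with_credentials("https://yourbitbucket.org/user/repo.git", "alice", "tok/en@1"))
```

Because the resulting URL contains a secret, treat it like a password and prefer a token with read-only scope if your git provider offers one.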

  • Training Data Store:

    • You will find a drop-down to select the input datastore where your training data is stored. We support:

      • AWS EBS Volumes

      • AWS EBS Snapshots

      • Azure Volumes

      • Azure Snapshots

    • If your training data is in an S3 bucket and your data loader code reads from S3 directly, you can select None in the drop-down.

    • Mount Path

      • This is the path at which the volume will be mounted when we create instances for you. Ideally, this path should match the input directory that your data loader code reads from.

      • If your data loader code directly reads from URLs, this can be left blank
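To illustrate how the Mount Path relates to data loader code, here is a minimal sketch. It assumes a hypothetical mount path of `/data` and a hypothetical helper `list_training_files`; neither is defined by NetBook. The point is only that the directory the loader reads from is the same path entered in the Mount Path field.

```python
from pathlib import Path

# Assumption: the value entered in the Mount Path field. "/data" is only an example.
MOUNT_PATH = "/data"


def list_training_files(data_dir: str, suffixes=(".csv", ".jpg", ".png")) -> list:
    """Return all files under data_dir whose extension is in suffixes, sorted by path."""
    root = Path(data_dir)
    if not root.is_dir():
        # A missing directory usually means the loader path and Mount Path differ.
        raise FileNotFoundError(f"{root} is not a directory; check the Mount Path setting")
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in suffixes
    )
```

A data loader would then call `list_training_files(MOUNT_PATH)` instead of hard-coding a different input directory.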

    • Volume/Snapshot ID

      • Provide the Volume ID or Snapshot ID for where your training data is stored

      • Please refer to Training Data Setup to get the Volume ID

    • Artifact Store Size

      • This will be the volume that NetBook creates to store your generated weight files or intermediary assets.

      • The minimum size is 1 GB. Make sure the volume size you choose is larger than the total size of your weight files; otherwise you will not be able to save them.
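One way to pick a size is to estimate it from the checkpoint size. The sketch below is an illustration under stated assumptions: the helper `required_store_gb` is hypothetical, 1 GB is treated as 1024³ bytes, and the 20% headroom default is an arbitrary choice, not a NetBook recommendation. Only the 1 GB minimum comes from the text above.

```python
import math

BYTES_PER_GB = 1024 ** 3  # Assumption: sizes are counted in binary gigabytes.


def required_store_gb(checkpoint_bytes: int, num_checkpoints: int = 1, headroom: float = 0.2) -> int:
    """Estimate a whole-GB volume size for the given checkpoints plus headroom.

    Never returns less than 1, the minimum Artifact Store Size.
    """
    if checkpoint_bytes < 0 or num_checkpoints < 0 or headroom < 0:
        raise ValueError("arguments must be non-negative")
    needed_bytes = checkpoint_bytes * num_checkpoints * (1 + headroom)
    return max(1, math.ceil(needed_bytes / BYTES_PER_GB))


# Example: five checkpoints of 3 GB each, with the default 20% headroom.
print(required_store_gb(3 * BYTES_PER_GB, num_checkpoints=5))
```

If your training code also writes logs or other intermediary assets to the same volume, include their size in the estimate.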

    • Artifact Mount Path

      • This will be a directory to which the Artifact store will be mounted.

      • Make sure the output directory to which you write the generated weight files matches this path.

      • If your code directly writes weight files to S3, you can skip this section.

    • Node Selection

      • We currently support single-node training only. Select the instance on which you would like to train the model.

    • Start Script

      • This will be the training start script you use to train your models. If you have your own training container, this should be similar to the entry point command you use.
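As an illustration of what a start script might invoke, here is a minimal sketch of a training entry point. Everything in it is hypothetical: the file name `train.py`, the flag names, the default paths `/data` and `/artifacts`, and the `run_summary.json` output. The training loop itself is left as a placeholder; the sketch only shows the input directory and output directory lining up with the Mount Path and Artifact Mount Path.

```python
import argparse
import json
from pathlib import Path


def parse_args(argv=None):
    parser = argparse.ArgumentParser(description="Hypothetical training entry point")
    # Assumption: these defaults mirror the Mount Path and Artifact Mount Path fields.
    parser.add_argument("--data-dir", default="/data")
    parser.add_argument("--output-dir", default="/artifacts")
    parser.add_argument("--epochs", type=int, default=1)
    return parser.parse_args(argv)


def main(argv=None) -> Path:
    args = parse_args(argv)
    out_dir = Path(args.output_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    # Placeholder: a real script would load data from args.data_dir, train,
    # and save weight files into out_dir.
    summary = {"data_dir": args.data_dir, "epochs": args.epochs}

    out_path = out_dir / "run_summary.json"
    out_path.write_text(json.dumps(summary))
    return out_path
```

With a call to `main()` added at the bottom of the file, the Start Script field could then hold a command such as `python train.py --data-dir /data --output-dir /artifacts --epochs 10`, again with paths chosen to match your own settings.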

    Once you start the experiment, NetBook handles the infrastructure and the training process. You can check your running experiments and their artifacts in the Experiments tab on the platform.
