Quick Start
This guide walks through a complete development and training workflow on the DCE 5.0 AI Lab platform: preparing datasets, developing in a Notebook, and running a training job.
-
Click Data Management -> Datasets in the navigation bar, then click Create. Create three datasets as follows:
- Code: https://github.com/d-run/drun-samples
  - For faster access in China, use Gitee: https://gitee.com/samzong_lu/training-sample-code.git
- Data: https://github.com/zalandoresearch/fashion-mnist
  - For faster access in China, use Gitee: https://gitee.com/samzong_lu/fashion-mnist.git
- Empty PVC: Create an empty PVC to store the trained model and logs produced by training.
Note
Currently, only a StorageClass with ReadWriteMany access mode is supported. Use NFS or, as recommended, JuiceFS.
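The Data dataset above stores Fashion-MNIST as gzip-compressed IDX files (for example `train-images-idx3-ubyte.gz`). To sanity-check the data once it is mounted, a file of this kind can be read with the Python standard library alone. The following is a minimal sketch; the function name `read_idx` is illustrative and is not part of the platform or the sample repository.

```python
import gzip
import math
import struct


def read_idx(path):
    """Read a gzip-compressed IDX file and return (shape, raw_bytes).

    Only unsigned-byte payloads (type code 0x08) are handled, which is
    the type used by the MNIST-style image and label files.
    """
    with gzip.open(path, "rb") as f:
        # Header: two zero bytes, a type code, and the number of dimensions.
        zero, type_code, ndim = struct.unpack(">HBB", f.read(4))
        if zero != 0 or type_code != 0x08:
            raise ValueError(f"unexpected IDX header in {path}")
        # Each dimension size is a big-endian unsigned 32-bit integer.
        shape = struct.unpack(f">{ndim}I", f.read(4 * ndim))
        data = f.read()
    # The payload must contain exactly one byte per element.
    if len(data) != math.prod(shape):
        raise ValueError(f"payload size does not match shape {shape}")
    return shape, data
```

Calling `read_idx` on the training image file should report a three-dimensional shape (image count, rows, columns). The directory the file lives in depends on the mount path chosen for the dataset, so no path is hard-coded here.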
-
Prepare the development environment by clicking Notebooks in the navigation bar, then click Create. Associate the three datasets created in the previous step and fill in the mount paths as shown in the image below:
-
Wait for the Notebook to be created, then click the access link in the list to open it. Execute the following command in the Notebook terminal to start the job training.
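The sketch below is inferred from the job configuration later in this guide (Command `python`, Arguments `/home/jovyan/code/tensorflow/tf-fashion-mnist-sample/train.py`); it assumes the code dataset is mounted at `/home/jovyan/code`. The helper names are illustrative. It launches the same script from Python and fails early if the dataset is not mounted where expected.

```python
import subprocess
import sys
from pathlib import Path

# Path taken from the job's Arguments field in this guide; it assumes the
# code dataset is mounted at /home/jovyan/code.
TRAIN_SCRIPT = Path("/home/jovyan/code/tensorflow/tf-fashion-mnist-sample/train.py")


def build_train_command(script=TRAIN_SCRIPT):
    """Return the argv equivalent to running `python <script>` in a terminal."""
    return [sys.executable, str(script)]


def run_training(script=TRAIN_SCRIPT):
    """Run the training script and return its exit code.

    Raises FileNotFoundError if the script is missing, which usually
    means the code dataset is mounted at a different path.
    """
    script = Path(script)
    if not script.is_file():
        raise FileNotFoundError(f"training script not found: {script}")
    return subprocess.run(build_train_command(script), check=False).returncode
```

In the Notebook terminal, the equivalent is to run `python` followed by the script path. If a different mount path was chosen for the code dataset, adjust the path accordingly.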
-
Click Job Center -> Jobs in the navigation bar and create a Tensorflow Single job. Configure the job as shown in the image below and enable the Job Analysis (Tensorboard) feature. Click Create and wait for the job to complete.
- Image address: release.daocloud.io/baize/jupyter-tensorflow-full:v1.8.0-baize
- Command: python
- Arguments: /home/jovyan/code/tensorflow/tf-fashion-mnist-sample/train.py
Note
For large datasets or models, it is recommended to enable GPU configuration in the resource configuration step.
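When GPU is enabled for a job, it can help to confirm that the training process actually sees the device. The following is a minimal sketch, assuming the job image ships TensorFlow 2.x; the function name `visible_gpu_count` is illustrative and is not part of the platform.

```python
def visible_gpu_count():
    """Return the number of GPUs TensorFlow can see.

    Returns 0 when TensorFlow is not installed, so the check can also be
    run safely outside the training image.
    """
    try:
        import tensorflow as tf
    except ImportError:
        return 0
    return len(tf.config.list_physical_devices("GPU"))


print(f"GPUs visible to TensorFlow: {visible_gpu_count()}")
```

A result of 0 inside a job that requested a GPU suggests revisiting the resource configuration step.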
-
For the job created in the previous step, click its job analysis to view the job status and find ways to optimize the training.