Comments on GEU's course

Lecture 0: Build image classification application with Flask

Resources

Comment

Slide

  • The slide on page 7 should add a softmax layer at the end of the figure to represent the prediction probabilities.

Code ML

  • You should set a random seed so that results are consistent across runs and others can reproduce them. You can use the code below:
import os
import random

import numpy as np
import torch


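# Seed every random number generator the training code may touch
def set_seed_everything(seed: int):
    random.seed(seed)
    os.environ["PYTHONHASHSEED"] = str(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)  # covers multi-GPU runs
    torch.backends.cudnn.deterministic = True
    # benchmark mode auto-tunes cuDNN kernels and can pick different
    # algorithms between runs, so it must be off for reproducibility
    torch.backends.cudnn.benchmark = False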
  • You should add a data visualization function to give more insight into the dataset.
  • It would help to split the original dataset into train_set, validation_set, and test_set. The current version only uses 1 batch of test_set to test model accuracy after each epoch.
  • It would also help to report a confusion matrix as a metric instead of relying on accuracy alone.
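The split and the confusion matrix suggested above can be sketched with the standard library alone. This is only an illustration, not the course's code: the helper names `split_indices`, `confusion_matrix`, and `accuracy`, as well as the 10% validation and test fractions, are assumptions. In a PyTorch pipeline, `torch.utils.data.random_split` with a seeded generator plays the same role as `split_indices`.

```python
import random
from typing import List, Sequence, Tuple


def split_indices(
    n: int, val_frac: float, test_frac: float, seed: int
) -> Tuple[List[int], List[int], List[int]]:
    """Shuffle indices 0..n-1 with a fixed seed and cut them into
    three disjoint lists: train, validation, test."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    n_test = int(n * test_frac)
    n_val = int(n * val_frac)
    test = indices[:n_test]
    val = indices[n_test:n_test + n_val]
    train = indices[n_test + n_val:]
    return train, val, test


def confusion_matrix(
    y_true: Sequence[int], y_pred: Sequence[int], num_classes: int
) -> List[List[int]]:
    """matrix[t][p] counts samples with true class t predicted as class p."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def accuracy(matrix: List[List[int]]) -> float:
    """Accuracy is the diagonal of the confusion matrix over the total."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total if total else 0.0


if __name__ == "__main__":
    train, val, test = split_indices(100, val_frac=0.1, test_frac=0.1, seed=42)
    print(len(train), len(val), len(test))
    cm = confusion_matrix([0, 0, 1, 1, 2, 2], [0, 1, 1, 1, 2, 0], num_classes=3)
    for row in cm:
        print(row)
    print(accuracy(cm))
```

The off-diagonal cells show which classes are confused with which, something a single accuracy number hides. Evaluating on the whole validation set each epoch, and on the test set only once at the end, keeps the test score an honest estimate.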

Code Flask

  • On line 85, it would be best not to load the model inside the result function. I think you should initialize the model once as a global variable when the application starts. In the current version, the model is re-initialized on every call to the result function, which is inefficient.
  • You can read the image bytes directly from Flask's request if you don't need to store the image. As the linked reference shows, this avoids the time spent writing the file to the drive and reading it back.
  • If you still write to storage, there should be a function that validates the file size before saving.
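The three Flask suggestions above could look roughly like the sketch below. It is not the course's application: `load_model`, the stub predictor it returns, the `/result` route, the `image` form field, and the 5 MiB limit are all assumptions made for illustration. `MAX_CONTENT_LENGTH` is a standard Flask configuration key that makes Flask reject oversized request bodies with a 413 response.

```python
import io

from flask import Flask, jsonify, request

MAX_IMAGE_BYTES = 5 * 1024 * 1024  # assumed upload limit (5 MiB)

app = Flask(__name__)
# Requests with a body larger than this are rejected with HTTP 413.
app.config["MAX_CONTENT_LENGTH"] = MAX_IMAGE_BYTES

LOAD_COUNT = 0  # only here to show that loading happens once


def load_model():
    """Hypothetical stand-in for the real (expensive) model loading."""
    global LOAD_COUNT
    LOAD_COUNT += 1

    def predict(image_bytes: bytes) -> str:
        # Dummy rule so the sketch runs without any ML dependency.
        return "cat" if len(image_bytes) % 2 else "dog"

    return predict


# Loaded once at start-up and reused by every request.
MODEL = load_model()


@app.route("/result", methods=["POST"])
def result():
    uploaded = request.files.get("image")
    if uploaded is None:
        return jsonify(error="no image uploaded"), 400
    # Read the upload straight from the request; nothing touches the disk.
    image_bytes = uploaded.read()
    # A file-like wrapper is what an image decoder would typically consume.
    buffer = io.BytesIO(image_bytes)
    label = MODEL(buffer.getvalue())
    return jsonify(label=label, size=len(image_bytes))
```

Because the size limit is enforced by Flask before the handler reads the upload, the same protection applies whether the image is kept in memory or written to storage.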

Lecture 1: Create ML projects with AutoML

Resources

Comment

  • You followed the available tutorials for the pycaret library, so there is nothing to comment on.

Lecture 2: Natural language processing using BERT

Slide

  • In the BERT example for the JLPT fill-in-the-blank task, I think you should add the theory of Beam Search (based on conditional probability) to the slides.
  • In the classification of Japanese sentences, I think you should add a text preprocessing step to your pipeline. Preprocessing may include: removing out-of-distribution data (characters from other languages, math notation), normalization (converting numbers to strings, times to strings, ...), and replacing some special characters.
  • A brief explanation of the Transformer architecture may be helpful for understanding.
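For the Beam Search suggestion above, a small sketch may make the role of conditional probability concrete. It does not use BERT: `toy_model` is a made-up probability table and `beam_search` is an illustrative helper, both assumptions. The score of a sequence is the sum of the log conditional probabilities of its tokens, and only the `beam_width` best partial sequences survive each step.

```python
import math
from typing import Callable, Dict, List, Tuple

Tokens = Tuple[str, ...]


def beam_search(
    next_probs: Callable[[Tokens], Dict[str, float]],
    beam_width: int,
    steps: int,
) -> List[Tuple[Tokens, float]]:
    """Return the best sequences with their log-probabilities, best first."""
    beams: List[Tuple[Tokens, float]] = [((), 0.0)]
    for _ in range(steps):
        candidates: List[Tuple[Tokens, float]] = []
        for seq, score in beams:
            for token, p in next_probs(seq).items():
                if p > 0:
                    # log P(seq + token) = log P(seq) + log P(token | seq)
                    candidates.append((seq + (token,), score + math.log(p)))
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_width]
    return beams


def toy_model(prefix: Tokens) -> Dict[str, float]:
    """Made-up conditional distribution P(next token | prefix)."""
    if not prefix:
        return {"A": 0.6, "B": 0.4}
    if prefix[-1] == "A":
        return {"x": 0.5, "y": 0.5}
    return {"x": 0.9, "y": 0.1}


if __name__ == "__main__":
    for width in (1, 2):
        seq, log_p = beam_search(toy_model, beam_width=width, steps=2)[0]
        print(width, seq, round(math.exp(log_p), 2))
```

With a beam width of 1 (greedy decoding) the search commits to "A" and ends with probability 0.30, while a width of 2 keeps "B" alive and finds ("B", "x") with probability 0.36. That gap is the usual motivation for beam search when several blanks must be filled jointly.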

Code

  • I don't have any comments on your code because it follows the standard example in the transformers library.

Lecture 3: Computer Vision with CNN

Slide

  • I think you should talk about traditional filters (Sobel, Gaussian, Laplacian, ...) before diving into CNNs. This gives students an overview of the intuition behind CNNs.
  • I think you should add some slides about data augmentation and embedding methods. That may help students understand your example code more easily.
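As an illustration of the traditional-filter suggestion above, the sketch below applies a hand-written Sobel kernel to a tiny image in pure Python. The helper `correlate2d` and the toy image are assumptions for the example. A convolutional layer performs the same sliding-window operation, except that its kernel values are learned rather than fixed by hand.

```python
from typing import List

Image = List[List[float]]

# Sobel kernel that responds to horizontal changes, i.e. vertical edges.
SOBEL_X: Image = [
    [-1, 0, 1],
    [-2, 0, 2],
    [-1, 0, 1],
]


def correlate2d(image: Image, kernel: Image) -> Image:
    """Slide the kernel over the image without padding ("valid" mode) and
    sum the element-wise products, as a CNN layer does with stride 1."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output: Image = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            total = 0.0
            for di in range(kh):
                for dj in range(kw):
                    total += image[i + di][j + dj] * kernel[di][dj]
            row.append(total)
        output.append(row)
    return output


if __name__ == "__main__":
    # Dark left half, bright right half: one vertical edge in the middle.
    image = [[0, 0, 0, 1, 1, 1] for _ in range(4)]
    for row in correlate2d(image, SOBEL_X):
        print(row)
```

The response is zero over the flat regions and large only where the window straddles the edge, which is the kind of feature map the first layers of a CNN tend to produce.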

Code

  • I think you should follow a consistent coding convention. In your code, some functions are indented with two spaces and others with four.

All rights reserved
