Alibaba Cloud Machine Learning Platform for AI: Image Classification by Caffe

By Garvin Li

The Image Classification by TensorFlow section introduced how to use the TensorFlow deep learning framework to classify CIFAR-10 images. This section introduces another deep learning framework, Caffe. With Caffe, you can train an image classification model by editing configuration files.

Make sure that you have already read the Deep Learning section and activated deep learning in Alibaba Cloud Machine Learning Platform for AI (PAI).

Datasets

This experiment uses the open-source CIFAR-10 dataset, which contains 60,000 images of 32 x 32 pixels. The images are classified into 10 categories: airplanes, automobiles, birds, cats, deer, dogs, frogs, horses, ships, and trucks. The following figure shows the dataset.

The dataset is already stored in JPG format in the public dataset of Alibaba Cloud Machine Learning Platform for AI. Machine learning users can directly enter the following paths in the Data Source Path field of deep learning components:

  • Testing data: oss://dl-images.oss-cn-shanghai-internal.aliyuncs.com/cifar10/caffe/images/cifar10_test_image_list.txt
  • Training data: oss://dl-images.oss-cn-shanghai-internal.aliyuncs.com/cifar10/caffe/images/cifar10_train_image_list.txt

Enter the path, as shown in the following figure:

Format Conversion

The Caffe deep learning framework currently supports only certain data formats. Therefore, you must first use the format conversion component to convert the JPG images.

  • OSS Path Storing Images and Table Files: set this parameter to the path of the public dataset predefined in Alibaba Cloud Machine Learning Platform for AI.
  • Output OSS Path: user-defined OSS path.

After format conversion, the following files are generated in the output OSS path, one set for the training data and one set for the testing data.

Record the corresponding paths for editing the Net file. The following is an example of the data paths:

  • Training data data_file_list.txt: bucket/cifar/train/data_file_list.txt
  • Training data data_mean.binaryproto: bucket/cifar/train/data_mean.binaryproto
  • Testing data data_file_list.txt: bucket/cifar/test/data_file_list.txt
  • Testing data data_mean.binaryproto: bucket/cifar/test/data_mean.binaryproto

Caffe Configuration Files

Enter the preceding paths in the Net file, as follows:
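As a rough illustration of this step, the Python sketch below fills the recorded paths into a Net-file fragment. Only the layer skeleton, include { phase }, and transform_param { mean_file } are standard Caffe fields. The layer type PLACEHOLDER_DATA_LAYER is a made-up stand-in, and the parameter block that takes data_file_list.txt is left as a comment, because the platform-specific names are not given in this text.

```python
from string import Template

# Paths recorded after format conversion (taken from the list above).
PATHS = {
    "train_list": "bucket/cifar/train/data_file_list.txt",
    "train_mean": "bucket/cifar/train/data_mean.binaryproto",
    "test_list": "bucket/cifar/test/data_file_list.txt",
    "test_mean": "bucket/cifar/test/data_mean.binaryproto",
}

# Net-file fragment with $placeholders. The layer type is an assumption and
# must be replaced by the data layer type that the platform documents.
NET_TEMPLATE = Template('''\
layer {
  name: "cifar"
  type: "$data_layer_type"
  top: "data"
  top: "label"
  include { phase: TRAIN }
  transform_param { mean_file: "$train_mean" }
  # data list: $train_list (goes in the platform-specific data parameter block)
}
layer {
  name: "cifar"
  type: "$data_layer_type"
  top: "data"
  top: "label"
  include { phase: TEST }
  transform_param { mean_file: "$test_mean" }
  # data list: $test_list (goes in the platform-specific data parameter block)
}
''')


def fill_net_paths(paths, data_layer_type="PLACEHOLDER_DATA_LAYER"):
    """Substitute the recorded OSS paths into the Net-file fragment."""
    return NET_TEMPLATE.substitute(data_layer_type=data_layer_type, **paths)


net_text = fill_net_paths(PATHS)
print(net_text)
```

Keeping the paths in one dictionary means the training and testing entries cannot drift apart when the output OSS path changes.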

Edit the Solver file:
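A Solver file is a short list of key-value settings. The Python sketch below renders one from a list of fields. The field names are standard Caffe solver fields, but every value is an assumption loosely modeled on Caffe's CIFAR-10 quick example rather than the settings used in this experiment, and the net path and snapshot prefix are hypothetical placeholders.

```python
# Illustrative solver settings. All values are assumptions, not the
# configuration used in this experiment.
SOLVER_FIELDS = [
    ("net", "bucket/cifar/train_val.prototxt"),  # hypothetical path of the Net file
    ("test_iter", 100),        # number of test batches per evaluation
    ("test_interval", 500),    # evaluate every 500 training iterations
    ("base_lr", 0.001),        # initial learning rate
    ("momentum", 0.9),
    ("weight_decay", 0.004),
    ("lr_policy", "fixed"),    # keep the learning rate constant
    ("display", 100),          # log every 100 iterations
    ("max_iter", 4000),        # total number of training iterations
    ("snapshot", 4000),        # save a model snapshot at this interval
    ("snapshot_prefix", "cifar10_quick"),  # hypothetical file prefix
    ("solver_mode", "GPU"),    # enum value, written without quotes
]

# Fields whose string values are enums and therefore must not be quoted.
ENUM_FIELDS = {"solver_mode"}


def render_solver(fields):
    """Render (key, value) pairs in the text format of a Caffe Solver file."""
    lines = []
    for key, value in fields:
        if isinstance(value, str) and key not in ENUM_FIELDS:
            lines.append(f'{key}: "{value}"')
        else:
            lines.append(f"{key}: {value}")
    return "\n".join(lines) + "\n"


solver_text = render_solver(SOLVER_FIELDS)
print(solver_text)
```

The rendered text can be saved as the Solver file and uploaded to OSS; the net entry should point to wherever the Net file is actually stored.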

Run the Experiment

  1. Upload the Solver and Net files to OSS, drag and drop the Caffe component to the canvas, and connect the component to the data source.
  2. Set the parameters in the Caffe component, as shown in the following figure. Set the Solver OSS Path to the OSS path of the uploaded Solver file and then click Run.
  3. View the generated image classification model file in the model storage path on OSS. You can use the following models to classify images.

  4. To view the corresponding log, refer to Logview in Image Classification by TensorFlow.

Reference: https://www.alibabacloud.com/blog/alibaba-cloud-machine-learning-platform-for-ai-image-classification-by-caffe_594519?spm=a2c65.12602511.0.0

Real-Time Object Tracking

To develop a robust perception model, we cannot always rely on object detection alone, for the following two reasons:

  1. Object detection is not truly real-time. Although many tiny models target this problem, they have not yet reached the point where detection runs at around 100 FPS.
  2. To take an action or plan a path, we need to know not only the current state of the object but also its future states.

Re3: Real-Time Recurrent Regression Networks for Visual Tracking of Generic Objects

Tracking using Re3

Robust object tracking requires knowledge and understanding of the object being tracked: its appearance, its motion, and how it changes over time. A tracker must be able to modify its underlying model and adapt to new observations. We present Re3, a real-time deep object tracker capable of incorporating temporal information into its model. Rather than focusing on a limited set of objects or training a model at test-time to track a specific instance, we pretrain our generic tracker on a large variety of objects and efficiently update on the fly; Re3 simultaneously tracks and updates the appearance model with a single forward pass. This lightweight model is capable of tracking objects at 150 FPS, while attaining competitive results on challenging benchmarks. We also show that our method handles temporary occlusion better than other comparable trackers using experiments that directly measure performance on sequences with occlusion.

TensorFlow implementation can be found at https://gitlab.com/danielgordon10/re3-tensorflow

Caffe Implementation

Download the Caffe snapshot: https://drive.google.com/open?id=11C0_LRlvhzq5-0pweCJmUbk71N-dwV8h

Re3: Feed Forward Network Architecture

Test the network with the following piece of code:

This piece of code only tests the output dimensions of the network; it does not measure the network's performance.
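A dimension check of this kind might look like the following sketch, assuming pycaffe is installed. It is not the author's original snippet: the helper names are made up for illustration, and the file names used to call it are hypothetical placeholders for the downloaded network definition and snapshot.

```python
def blob_shapes(net):
    """Return (blob name, shape) pairs for every blob in a pycaffe Net."""
    return [(name, tuple(blob.data.shape)) for name, blob in net.blobs.items()]


def report_shapes(prototxt_path, weights_path):
    """Load a network in test mode, run one forward pass, and print blob shapes."""
    # Imported here so that blob_shapes can be used without Caffe installed.
    import caffe

    caffe.set_mode_cpu()
    net = caffe.Net(prototxt_path, weights_path, caffe.TEST)
    # One pass over whatever the input blobs currently hold; this checks that
    # the layers connect and run, not that the predictions are good.
    net.forward()
    for name, shape in blob_shapes(net):
        print(f"{name}: {shape}")
```

Calling report_shapes("re3_deploy.prototxt", "re3.caffemodel") with the real file names prints one line per blob, so the shape of the final output blob can be compared with the expected bounding-box dimensions.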