It was fun to play with Watson Studio and create my own convolutional neural network. Watson Studio is a complex but easy-to-use and graphically intuitive tool, and it integrates well with the Watson Machine Learning engine. I used the standard MNIST data set (images of hand-written digits), created a three-layer CNN, exported the code in TensorFlow / Keras, trained the model both on GPUs on Watson and locally in the Jupyter Python environment on my MacBook, and got nice results. To data scientists and AI developers: go and use it. And for all other enthusiasts, the best way to learn AI is to create it yourself 🙂!
First, to access the Watson environment, you need an IBM ID. Visit bluemix.net and create your free account. Most of the cloud and cognitive services are free, so you can learn, develop your models and deploy them for productive use.
Once you have access to the Watson platform, you will need to set up three services: Cloud Object Storage (to store your data sets, trained models and training results), Machine Learning (to train your models and benefit from ultrafast GPUs) and Watson Studio (to create and manage your DL models).
When this is done, you will have the following items in your Watson Dashboard:
The next step is to create buckets in your Cloud Object Storage (COS) and upload your data. I recommend creating two buckets (you can think of them as containers for your data objects), one for the data and one for the models and results. For larger models, finer granularity is needed: Watson Machine Learning (WML) loads the data from the training data bucket, so you want to avoid unnecessary waiting time and memory consumption for data that is not used in your training runs.
Here is how it looks when you have created both buckets and uploaded the data:
When this is done, you can launch Watson Studio and create your project. I named mine MNIST-LZRVC. On the right-hand side you will see your Cloud Object Storage (COS); Watson Studio automatically creates a connection to your COS service.
Next, in the Assets tab of Watson Studio, go to Modeler Flow and add a new flow. Name it after your project but add a unique id, so that you can tell your flows apart when you start modifying and experimenting with different architectures.
These building blocks allow you to construct the whole deep neural structure from scratch very easily. Of course, you need to know what you are doing, but there are plenty of very good learning resources on the web if you want to learn more. In this image recognition example, I connected my input data to three blocks of convolution + ReLU activation + pooling, then to a fully connected layer, and passed the resulting feature representation to a softmax classifier with 10 categories (one for each digit). At the end, I connected everything to the cross-entropy loss function and used the Adam adaptive learning-rate optimizer. I’m measuring the performance of my model on the validation data set using the accuracy metric.
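To make the last two stages of this pipeline concrete, here is a minimal plain-Python sketch of what a softmax classifier and the cross-entropy loss compute for a single image. The scores are made up for illustration, and this is not the code that Watson Studio generates; it only shows the arithmetic behind those two blocks.

```python
import math

def softmax(logits):
    """Turn raw class scores into probabilities that sum to 1."""
    # Subtract the largest score first for numerical stability.
    top = max(logits)
    exps = [math.exp(x - top) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probabilities, true_label):
    """Negative log-probability assigned to the correct class."""
    return -math.log(probabilities[true_label])

# Hypothetical scores for the 10 digit classes; class 3 scores highest.
logits = [0.1, 0.2, 0.0, 4.0, 0.3, 0.1, 0.0, 0.5, 0.2, 0.1]
probs = softmax(logits)

# The loss is small when the true digit is the one the model favours,
# and large when the true digit received a low probability.
loss_if_3 = cross_entropy(probs, 3)
loss_if_7 = cross_entropy(probs, 7)
print(loss_if_3 < loss_if_7)
```

During training, the optimizer adjusts the network weights so that this loss, averaged over the training images, goes down.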
I have experimented with different numbers of layers. One is clearly not enough, but three layers give very good results, even after only two or three training epochs. Of course, you need to be careful with hyperparameters such as the number and sizes of filters, strides, channels, etc.
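One reason these hyperparameters need care is that they determine how quickly the image shrinks as it passes through the layers. The sketch below works through the arithmetic for a 28 x 28 MNIST image. The choices in it are assumptions for illustration, not the settings of my flow: convolutions with "same" padding and stride 1, 2 x 2 pooling with stride 2, and 32, 64 and 128 filters.

```python
def conv_same(size, stride=1):
    """Output width of a convolution with 'same' padding."""
    return -(-size // stride)  # ceiling division

def pool_valid(size, window=2, stride=2):
    """Output width of a pooling layer without padding."""
    return (size - window) // stride + 1

def feature_map_sizes(input_size, num_blocks):
    """Spatial size after each convolution + pooling block."""
    sizes = []
    size = input_size
    for _ in range(num_blocks):
        size = pool_valid(conv_same(size))
        sizes.append(size)
    return sizes

# MNIST images are 28 x 28 pixels.
sizes = feature_map_sizes(28, 3)
print(sizes)  # [14, 7, 3]

# Assumed filter counts per block (hypothetical values).
filters = [32, 64, 128]
# Number of values handed to the fully connected layer.
flattened = sizes[-1] ** 2 * filters[-1]
print(flattened)  # 1152
```

Under these assumptions a fourth block would leave only a 1 x 1 feature map, which shows how little room there is to go much deeper on images this small.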
When you have finished this, the most creative part of your project, connect your model with the COS bucket that contains your data set. Click on the three dots or upload your files into this new data bucket. I’m using pickled data in three separate files for training, validation and test.
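If you have not worked with pickled data before, the following sketch shows the general idea of writing and reading one file per split. The file names and the dictionary layout ("images", "labels") are hypothetical; the files I actually use may be organized differently.

```python
import os
import pickle
import tempfile

def save_split(path, images, labels):
    """Pickle one data split (images plus labels) to a file."""
    with open(path, "wb") as f:
        pickle.dump({"images": images, "labels": labels}, f)

def load_split(path):
    """Read a split back and return (images, labels)."""
    with open(path, "rb") as f:
        data = pickle.load(f)
    return data["images"], data["labels"]

# Round trip with tiny made-up data instead of real MNIST images.
workdir = tempfile.mkdtemp()
for name in ("train", "validation", "test"):
    save_split(os.path.join(workdir, name + ".pkl"),
               [[0, 255], [128, 64]], [7, 3])

images, labels = load_split(os.path.join(workdir, "test.pkl"))
print(labels)  # [7, 3]
```

Only unpickle files that come from a source you trust, because loading a pickle can execute arbitrary code.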
When your model is ready, take a look at the two arrow-like icons in the toolbar. They allow you to download the code or launch your model.
When you select the download button, you get a choice of different formats. Note that with Watson Studio you can create training code for multiple deep learning frameworks. This is the beauty of it: using IBM’s Fabric for Deep Learning (FfDL), you are able to run your deep learning algorithms on TensorFlow (with or without Keras), PyTorch and Caffe, all within one unified training and deployment cloud environment.
So, I continue to download my model. I’m interested to see how it is written.
And here it is, running locally in Python on my laptop. No changes to the code needed!
I also like the Jupyter notebook environment very much, so I just copy-pasted my code into it. If you would like to check it out, please download all files from my GitHub repository.
I’m adding here a couple of screenshots to show the training and prediction with this Jupyter notebook. One training epoch takes 2 minutes, and I’m getting 96% accuracy on the included test data set!
Here you can see the structure of the model:
And here is the prediction part. For previously unseen images, I open them in grayscale, increase the contrast, then normalize, resize and reshape them. Then I call the model.predict() method on the image and extract the predictions from the classifier. The result is an array of 10 probabilities, and we select the digit with the highest probability as our prediction.
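The steps around the model call can be sketched in plain Python as follows. This is an illustration under simplifying assumptions, not the notebook code: the image is a flat list of grayscale values, the contrast step is a simple linear stretch, and the probability array is made up rather than returned by model.predict().

```python
def stretch_contrast(pixels):
    """Linearly rescale grayscale values so they span 0..255."""
    lo, hi = min(pixels), max(pixels)
    if hi == lo:
        return [0 for _ in pixels]  # flat image, nothing to stretch
    return [round((p - lo) * 255 / (hi - lo)) for p in pixels]

def normalize(pixels):
    """Scale 0..255 grayscale values to the range 0..1."""
    return [p / 255.0 for p in pixels]

def pick_digit(probabilities):
    """Return the most likely digit and its probability."""
    best = max(range(len(probabilities)), key=lambda i: probabilities[i])
    return best, probabilities[best]

# A faint, low-contrast "image" of four pixels (made-up values).
pixels = [100, 120, 160, 200]
stretched = stretch_contrast(pixels)
print(stretched)  # [0, 51, 153, 255]
inputs = normalize(stretched)

# Made-up classifier output for the 10 digit classes.
probabilities = [0.01, 0.02, 0.01, 0.01, 0.05, 0.02, 0.03, 0.80, 0.03, 0.02]
digit, confidence = pick_digit(probabilities)
print(digit, confidence)  # 7 0.8
```

Looking at the winning probability as well as the digit is useful: a low value is a hint that the image differs from what the model saw in training.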
Note: I noticed inaccurate predictions for the number 7 when I write it with a dash across the numeral, and for all numbers when they are written with a thin pencil. The training data seems to be all of one similar style, so the real-world results are a bit skewed.
Now, back to Watson Studio. I want to upload my model directly from the design canvas to the Watson Machine Learning engine.
First, we need to associate WML with our project. Select your project and open the Settings tab. Scroll down to Associated services and add Machine Learning.
We will now create an experiment (in other words, prepare to launch the training run):
Enter a name for your training run (use ids and meaningful abbreviations so that you can trace back where the results came from). Select and confirm your COS buckets for the input data and results, and create your training definition to choose the number of GPUs and whether you would like to do hyperparameter optimization (I will describe this in a future post).
When you click on Create and Run, WML automatically starts your training process. You can monitor the progress and review the log files as the system works. There is some delay until your job lands on the GPUs; in my case I got the results after 5 minutes 25 seconds. This elapsed time is actually not bad, bearing in mind that I left 10 epochs specified in my model definition. One training epoch took 18 seconds, and there was also some queuing of my job before it started.
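A quick back-of-the-envelope calculation puts these numbers in perspective. It uses only the timings quoted in this post and assumes that every epoch takes about the same time and that the local and GPU runs trained the same model on the same data.

```python
# Timings quoted in this post.
total_elapsed_s = 5 * 60 + 25   # whole WML run: 5 min 25 s
epochs = 10
gpu_epoch_s = 18                # one epoch on the Watson GPUs
local_epoch_s = 2 * 60          # one epoch on my laptop

# Time spent actually training on the GPUs.
gpu_training_s = epochs * gpu_epoch_s
# The rest is queuing, start-up and other overhead.
overhead_s = total_elapsed_s - gpu_training_s
# The same 10 epochs at laptop speed.
local_training_s = epochs * local_epoch_s
# Per-epoch speed-up of the GPUs over the laptop.
speedup = local_epoch_s / gpu_epoch_s

print(gpu_training_s, overhead_s, local_training_s, round(speedup, 1))
# 180 145 1200 6.7
```

So roughly 3 of the 5 and a half minutes were real training, and even with the overhead included the run finished well ahead of the 20 minutes the same 10 epochs would take locally.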
On the results page you can see the location of your trained model. In the next blog post I will write about other, more sophisticated ways in which WML can be used for your deep learning projects.
Watson Studio, Watson Machine Learning and Cloud Object Storage are, as you could see from this tutorial, very sophisticated, well integrated, extremely flexible and easy-to-use tools for data scientists and AI developers. Deep Learning as a Service (DLaaS) is an excellent feature of the IBM Watson platform that one can use in different ways.
I hope this tutorial was easy to follow. If you have any questions, please connect with me on LinkedIn to start the discussion.
May 21, 2018
PS: Here again is the link to my GitHub repository: