For this assignment, I was really inspired to create a model where I could train an image classifier on images of Pokemon and then use that model on human faces. The output would be the Pokemon that the human face looks most like. I thought it would be a fun, whimsical project to work on, since the outcomes would be quite interesting (and funny) to observe.
The first step was to arrange the dataset. This was much easier than I thought. I stumbled across a zip file on Kaggle with over 800 PNG files of Pokemon. After downloading it, I cleaned out the unnecessary files and left only the first 721 original Pokemon. Since I needed labels too, I was able to find those as well, in a GitHub Gist.
Once I had matched the labels to the images, I attempted to retrain an existing model with the help of ml5’s “feature extractor”. (According to the ml5js documentation, there is no method to train a classifier from scratch as of yet, so I had to retrain an existing model.)
This is the stage where I got stuck and wasn’t able to make much progress, for the reasons described below.
To get started, I followed the steps in this link, which outlines the gist of how to retrain a model with custom images.
The steps seemed fairly straightforward.
1) Get the features from ‘MobileNet’ using ml5’s feature extractor, and create a classifier out of it.
2) Add all the new images to the classifier.
3) Train the classifier with the new images.
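As a rough illustration, the three steps above might be sketched as follows. This is a minimal sketch rather than the code from this project: trainOnExamples and the examples array are hypothetical names, the ml5 object is passed in as a parameter so the function can also be exercised with a stand-in, and the numLabels option is an assumption about the ml5 version in use (older releases may name it differently).

```javascript
// Hypothetical helper: retrain MobileNet features on custom images.
// `ml5` is the ml5.js library object; `examples` is an array of
// { image, label } pairs whose images are assumed to be fully loaded.
function trainOnExamples(ml5, examples, numLabels) {
  return new Promise((resolve, reject) => {
    // Step 1: get the MobileNet features; the callback fires once the
    // model is ready. Assumption: `numLabels` is the right option name.
    const features = ml5.featureExtractor('MobileNet', { numLabels }, async () => {
      try {
        const classifier = features.classification();

        // Step 2: add the images one by one, waiting for each callback.
        for (const { image, label } of examples) {
          await new Promise((added) => classifier.addImage(image, label, () => added()));
        }

        // Step 3: train once, after every image has been added.
        // The callback receives a loss value while training and null at the end.
        const result = classifier.train((loss) => {
          if (loss == null) resolve(classifier);
        });
        // If this ml5 version returns a promise from train(), surface its failure.
        Promise.resolve(result).catch(reject);
      } catch (err) {
        reject(err);
      }
    });
  });
}
```

With the real library loaded in the page, this could be called as trainOnExamples(ml5, examples, 721). The point of the structure is that train() runs only after the last addImage() callback has fired.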
From this point onwards, it was a mix of dealing with JavaScript-related errors and several logic errors in my ml5js workflow.
My approach was to save all the images into a folder and then read them into an array of images. The same goes for the labels, which are read from a text file.
This was my initial piece of code. (I am posting it here to highlight some of the early mistakes.)
let text;
let features = ml5.featureExtractor('MobileNet');
let classifier = features.classification();

function preload() {
  text = loadStrings("../pokemonList.txt");
}

function setup() {
  createCanvas(1280, 720);
  console.log("Gonna add images now");
  for (var i = 1; i <= text.length; i++) {
    img = new Image();
    img.src = "pokemon/" + i.toString() + ".png";
    //we need to add each pokemon image in succession
    classifier.addImage(img, text[i - 1], imageAdded);
  }
  console.log("done training");
}

function imageAdded() {
  console.log("Done adding images, gonna start training now. might take some time?");
  console.log("Image added, training model..");
  classifier.train();
}

function draw() {
  // console.log(text);
  console.log(text.length);
  noLoop();
}
The initial mistake was trying to load all the images at once and then add them to the classifier in one go. There are multiple problems with this. First of all, because of the way JavaScript works, the program might not wait for an image to load completely before using it: it moves on to the next function or task even though the previous one has not finished yet. The solution(?) was to move the image-loading segment into the preload() function instead. (However, even though the images then seemed to load successfully, the TensorFlow.js model still returned errors relating to null pixel values.)
Secondly, since I was using p5js, there was a more appropriate way to load an image, which is literally a function called loadImage().
Then, I was attempting to “bulk train” the model as described on the ml5js page. This caused all sorts of erratic behavior, including error messages such as “Uncaught (in promise) Error: Add some examples before training!” and “ValueError: Input arrays should have the same number of samples as target arrays”, along with several graphics-related errors such as failures to compile fragment shaders. After a quick Google search, I found out that it is best to train the classifier one image at a time, through a callback function that invokes the train() function.
Lastly, there was an issue with using the value of text.length as the for loop boundary. For some reason, the value would appear to be zero in the setup() function and would only appear to be 721 in the draw() loop, even though the text file was loaded and read in the preload() function. The workaround was simply to hardcode the value 721 in the loop until I found out exactly what went wrong and why.
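One way to make the program wait for every image before any of them is used is to wrap each load in a promise. The following is a minimal sketch, not the fix used in this project: loadAllImages and its loadOne parameter are hypothetical names, and the loader is passed in so that the same function could work with p5's loadImage(path, onSuccess, onFailure) in the browser or with a stand-in elsewhere. It assumes, as in the code above, that line N of the label file names the Pokemon stored in pokemon/N.png.

```javascript
// Hypothetical helper: resolve only when all numbered images have loaded.
// `labels` is the array of names read from the label file;
// `loadOne(path, onSuccess, onFailure)` is any loader with the same
// callback shape as p5's loadImage().
function loadAllImages(labels, loadOne, folder = 'pokemon') {
  const pending = labels.map((label, index) => {
    const path = `${folder}/${index + 1}.png`; // files are numbered from 1
    return new Promise((resolve, reject) => {
      loadOne(
        path,
        (image) => resolve({ image, label, path }),
        () => reject(new Error(`Could not load ${path}`))
      );
    });
  });
  // Promise.all keeps the input order and rejects on the first failure.
  return Promise.all(pending);
}
```

In a p5 sketch this might be called from setup() as loadAllImages(labels, loadImage).then(...), so that the next step only starts once every file has arrived. Because the loop runs over the labels array itself, no hardcoded count is needed once that array is known to be populated.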
Making a couple of tweaks to the code seemed to load all the images. (This was verified by invoking the image() function on every image in the draw loop and checking whether they actually displayed on the screen. They did.)
The next step was to train the model, but unfortunately, I seemed to have reached another roadblock with this.
The program loads all the images and runs through the train() function, but this triggers an error in the underlying TensorFlow.js framework, which in turn seems to propagate all the way up to ml5 again.
My presumption is that even though the for loop that loads the images runs to completion, the images are not yet fully loaded and ready to be read into the classifier.
Afterwards, to check whether the images were actually loaded, I displayed them before running them through train(). They displayed successfully. Additionally, I removed the callback to imageAdded() and tried to train on the images in bulk. This produced the same error.
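Displaying an image is one check; another would be to inspect the image objects directly before anything is passed to addImage(). The sketch below is only a suggestion under stated assumptions: findUnloaded is a hypothetical helper, and it assumes the images are plain HTMLImageElement objects (as created by new Image()), whose complete and naturalWidth properties report whether the browser has finished loading them and decoded any pixel data.

```javascript
// Hypothetical helper: list the entries whose image is not usable yet.
// Assumes each entry looks like { image: HTMLImageElement, label: string }.
function findUnloaded(entries) {
  return entries.filter(({ image }) => {
    // `complete` becomes true once loading has finished (or failed);
    // `naturalWidth` stays 0 when no pixel data is available.
    const ready = image.complete && image.naturalWidth > 0;
    return !ready;
  });
}
```

Logging findUnloaded(entries).map((e) => e.label) just before training would show which Pokemon, if any, were handed over without pixel data.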
For now, there seems to be little documentation on how to solve issues like this, but I enjoyed the entire process (although it was, and is, frustrating not to be able to finish on time). At the same time, I feel this project helped me learn much more about the JavaScript workflow and the unexpected behaviour that can come with it, along with some useful p5js functions. The most useful thing I learnt was to structure code in such a way that there is enough time for the images to load completely and be cached before any sort of logic is applied to them.
If anyone has any ideas or suggestions as to how I might be able to proceed with this little project, please send them this way!
The code can be found here.