How to recognize coins with deep learning visual model
Jan 14, 2021 • 9 min read
In the modern digital world, we still rely on manual data entry for many important business processes. Needless to say, this leads to slower data processing, frustrated employees and costly errors which could be avoided with automated reliable data entry and processing. Often, even the data which was machine-generated on one end of the business transaction has to be human-read and entered on another end.
Such situations are pretty common in the manufacturing, lab, logistics and even healthcare industries, where people have to deal with standard, labeled equipment and materials. Wouldn't it be nice to automate visual recognition of such labels and avoid the corresponding mistakes? Such solutions do exist, yet they rely on industrial hardware and custom computer vision algorithms, are very specialized and, as a result, quite costly.
Is it possible to democratize industrial label recognition? After all, even smartphones these days have high-quality cameras and enough processing power to perform all kinds of neat AI tricks. In this blog post we will describe how modern cloud AutoML platforms make it easy to build an industrial label recognition system from scratch.
One of our customers runs large-scale electronics manufacturing operations, where all kinds of components are received in bulk and handled through manual data entry by warehouse staff. The cost of an error in this receiving process is very high, which leads to slow processing and double-checking, so the speed and reliability of the overall process are of high importance.
First of all, let's look at the anatomy of a typical industrial label:
There is no well-defined standard for industrial labels, but they all share many important elements which have to be recognized by the system. Typically they contain a short text description of the material, duplicated as a 1D or 2D barcode. Very often they also contain a company logo and one or more special pictograms which denote compliance with some manufacturing standard or the presence of hazardous materials.
One may wonder why simple barcode recognition is not enough to solve our data entry problem. There are plenty of barcode scanners on the market, and open source libraries are widely available. Well, first of all, in real life those labels look a lot like these:
Barcodes can be damaged or partially obscured by other labels during transportation. Also, some of the important data may not be encoded in the barcodes at all. This is why it is also important to recognize text and pictograms. Getting the same data about the material from multiple sources greatly improves the quality and reliability of recognition through mutual corroboration.
Now, let's look into how to build a label recognition system with the help of cloud AI services and AutoML.
When analyzing images of our materials we have to solve several separate computer vision problems. Each problem has its own optimal approach and solution. Some of those problems, such as barcode reading, are solved with algorithmic approaches, while many others require a machine learning approach.
Let's look at each of the analysis steps:
So, we should whip up our trusty Jupyter notebook, load some TensorFlow or PyTorch, get a couple of GPU boxes and start coding away... Or should we? After all, we are not trying to tackle a new unsolved problem or push the boundaries of the possible with a novel state-of-the-art ML model. We are trying to solve a business problem which boils down to a set of pretty standard ML tasks. It turns out there are AutoML platforms out there which make creating models for well-understood ML tasks a breeze.
AutoML stands for "automated machine learning", so let's take a quick look at a typical data science process, how parts of it can be automated by an AutoML platform, and what the associated tradeoffs are.
As you can see from the diagram above, an AutoML platform gives a nice balance between complexity and control, which is ideal for solving typical ML problems.
Every large cloud provider these days has its own AutoML solution which targets a variety of ML applications in computer vision, natural language processing, predictive analytics on structured data, time series forecasting and more.
For this project, we will focus on Google's Cloud AutoML, in particular AutoML Vision for creating our custom models.
First of all, we will train our label recognition model. It turns out that even a few dozen labeled images can bring model quality to over 90% IoU (intersection over union, the overlap between the predicted and the annotated label region).
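For readers unfamiliar with the metric, here is a minimal sketch of how IoU can be computed for two axis-aligned bounding boxes. The `Box` type and the `iou` function are illustrative names of our own, not part of AutoML Vision, and the platform's own evaluation may differ in details.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Axis-aligned bounding box in pixel or normalized coordinates (hypothetical helper type)."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def iou(a: Box, b: Box) -> float:
    """Return intersection over union of two boxes, in the range [0, 1]."""
    # Width and height of the overlapping rectangle (zero if the boxes do not overlap).
    inter_w = max(0.0, min(a.x_max, b.x_max) - max(a.x_min, b.x_min))
    inter_h = max(0.0, min(a.y_max, b.y_max) - max(a.y_min, b.y_min))
    inter = inter_w * inter_h

    area_a = (a.x_max - a.x_min) * (a.y_max - a.y_min)
    area_b = (b.x_max - b.x_min) * (b.y_max - b.y_min)
    union = area_a + area_b - inter

    # Guard against degenerate (zero-area) boxes.
    return inter / union if union > 0 else 0.0


if __name__ == "__main__":
    predicted = Box(0, 0, 2, 2)
    annotated = Box(1, 1, 3, 3)
    print(f"IoU = {iou(predicted, annotated):.3f}")
```

An IoU of 1.0 means the predicted box matches the annotation exactly, while 0.0 means the two boxes do not overlap at all.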
Next, we will train the model which will recognize the kind of label and extract the company logo. Again, with only a couple of hundred images in the dataset we can get over 99% accuracy on the company and label type.
It is equally easy to train a pictogram recognition model:
Once we have identified the label type, we can apply its known layout and extract all the important data items. After that, we can use Google OCR and barcode reading services to extract the information from text and barcodes.
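Applying a known layout can be as simple as a lookup table of field positions per label type. The sketch below is only an illustration under our own assumptions: the `LAYOUTS` table, the `vendor_a_reel` label type, the field names and the `field_regions` function are all hypothetical, and the coordinates are made up. It converts relative field positions into pixel regions that can then be cropped and sent to OCR or barcode reading.

```python
from typing import Dict, Tuple

# (left, top, right, bottom) as fractions of the label's width and height.
RelBox = Tuple[float, float, float, float]
# (left, top, right, bottom) in image pixels.
PixelBox = Tuple[int, int, int, int]

# Hypothetical layout templates, one per recognized label type.
LAYOUTS: Dict[str, Dict[str, RelBox]] = {
    "vendor_a_reel": {
        "part_number_text": (0.05, 0.10, 0.60, 0.25),
        "part_number_barcode": (0.05, 0.25, 0.60, 0.45),
        "quantity_text": (0.65, 0.10, 0.95, 0.25),
    },
}


def field_regions(label_type: str, label_box: PixelBox) -> Dict[str, PixelBox]:
    """Map each field of the given label type to a pixel region inside the detected label."""
    left, top, right, bottom = label_box
    width, height = right - left, bottom - top
    return {
        name: (
            left + round(l * width),
            top + round(t * height),
            left + round(r * width),
            top + round(b * height),
        )
        for name, (l, t, r, b) in LAYOUTS[label_type].items()
    }


if __name__ == "__main__":
    # Label detected at pixels (100, 200)-(1100, 700) in the photo.
    for field, region in field_regions("vendor_a_reel", (100, 200, 1100, 700)).items():
        print(field, region)
```

This sketch assumes the detected label is roughly upright and undistorted; a real system would likely need to correct rotation and perspective before applying the template.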
Here is what the overall workflow looks like:
Industrial label recognition workflow
It is worth mentioning that once the recognition step is complete, it is important to cross-check the data extracted from text, barcodes and pictograms. Any discrepancies have to be either resolved automatically or reported to the user.
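One possible way to implement such cross-checking is a simple vote across sources, sketched below. This is an assumption on our part rather than a description of the production system: the `cross_check` function, the source names and the normalization rule (trimming whitespace and upper-casing) are all hypothetical. A field is accepted when one value has a strict majority among the sources that reported it, and flagged for the user otherwise.

```python
from collections import Counter
from typing import Dict, List, Optional, Tuple


def cross_check(
    readings: Dict[str, Dict[str, Optional[str]]]
) -> Tuple[Dict[str, str], List[str]]:
    """Reconcile per-source readings into resolved values and a list of conflicting fields.

    `readings` maps a source name (e.g. "barcode", "ocr") to the fields it extracted.
    """
    resolved: Dict[str, str] = {}
    conflicts: List[str] = []

    # Every field reported by at least one source, in a stable order.
    fields = sorted({field for values in readings.values() for field in values})

    for field in fields:
        # Count normalized values, skipping sources that could not read the field.
        votes = Counter(
            values[field].strip().upper()
            for values in readings.values()
            if values.get(field)
        )
        ranked = votes.most_common(2)
        if not ranked:
            conflicts.append(field)  # no source produced a value
        elif len(ranked) == 1 or ranked[0][1] > ranked[1][1]:
            resolved[field] = ranked[0][0]  # unanimous or clear winner
        else:
            conflicts.append(field)  # tie between sources: ask the user

    return resolved, conflicts


if __name__ == "__main__":
    resolved, conflicts = cross_check(
        {
            "barcode": {"part_number": "ABC-123", "quantity": "500"},
            "ocr": {"part_number": "abc-123 ", "quantity": "5OO"},
        }
    )
    print("resolved:", resolved)
    print("needs review:", conflicts)
```

In this example the part number is confirmed by both sources, while the quantity (where OCR confused zeros with the letter O) is reported for manual review.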
Once the service is up and running, we can create an end-user label recognition application on the desktop:
In this blog post, we showed how to build a simple yet functional industrial label recognition system, leveraging the power of AutoML to train custom computer vision models.
AutoML greatly speeds up and democratizes the training of models for standard computer vision tasks. So, before embarking on the creation and training of a customized deep learning model for image processing, it is worth training a quick AutoML model to establish a solid baseline to improve upon.