How to build visual search for manufacturing parts with open source: image similarity, OCR and keyword search
Jan 11, 2022 • 12 min read
The manufacturing industry is under immense pressure to adopt cutting-edge technology in order to remain competitive in an increasingly digital and on-demand economy. You have to be faster, smarter and more agile to meet customer and industry demands, and process efficiency is fundamental to achieving this. Tackling this challenge involves reimagining every aspect of the manufacturing process - from engineering technology and supply chain logistics right through to inventory management and small-part replacement. Finding small replacement parts might seem insignificant in terms of long-term business goals, but if you’re the Director of Engineering at a manufacturing company, you understand how important it is to optimize and accelerate this process in order to innovate, create and add business value.
In this article we will show you how we helped a forward-thinking manufacturing company optimize their replacement part discovery process with a universal visual search solution.
The process of engineering a robust mechanical product, whether it's an escalator or a car engine, requires many small parts. We accept that these parts wear out over time and require replacement to avoid breakdowns and to keep the mechanics of the product running smoothly.
Correctly identifying these small manufacturing parts in a search catalog can, however, be very challenging. Microcircuits and chips, bolts and screws, pins and fuses, plus many other critical small parts can differ in minor details that are barely visible to the naked eye - creating hundreds of patterns with multiple products for each pattern. Furthermore, manually searching a complex catalog of thousands of parts - parts that may have no visible serial number, carry only a label, or are hard to tell apart - makes an already frustrating process even more time-consuming for maintenance engineers.
We were approached by a manufacturing company who faced just such a problem. Their existing part-finding process involved maintenance engineers searching for parts in a catalog by submitting photos of the parts they needed or using a limited keyword search. The results were inefficient and often required additional company resources and time to find the correct parts to order.
They needed a solution to find small replacement parts quickly and efficiently using image search, optical character recognition and extended keyword-based search, regardless of whether the input photo showed the part itself or its packaging. And of course, we were up for the challenge!
During our analysis of the data that the client shared with us, we found a mix of photos of the parts themselves, photos of packages or only product labels. Serial numbers or easily distinguishable characters were clearly visible in some photographs, but not in all of them.
One of the primary challenges we faced, therefore, was dealing with the differences between the photos the engineers were submitting compared to the images in the search catalog. For example, there were examples of visually indistinguishable images where only the model number differentiated the part, photos of a sticker with a serial number instead of an object itself, rulers alongside objects in photos to indicate scale, and drawings of the part in the catalog instead of photos.
In short, we needed to build a universal solution capable of processing and recognizing any type of input data, while at the same time satisfying the following requirements:
To meet these solution requirements, we decided to use a combination of several technologies, including visual search, optical character recognition (OCR) and optimized keyword search.
Next, we’ll discuss these technologies in more detail to understand how they work to create an effective visual search solution for manufacturing parts.
A typical visual search system is based on a visual similarity model - a neural network architecture, such as a CNN or a Vision Transformer, that processes an image, extracts its key features and learns to differentiate one image from another. The trained visual similarity model represents each product as an encoded feature vector in a multi-dimensional vector space, where the distance between feature vectors determines the relevancy of product matches. In other words, based on Euclidean or cosine distance, the k nearest feature vectors to the input image (the k-NN algorithm) are returned as the most relevant search results.
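The retrieval step can be sketched in a few lines of NumPy. This is a minimal illustration, not the production system: the toy 2-D vectors below stand in for the embeddings a trained similarity model would produce for each catalog image.

```python
import numpy as np

def cosine_knn(query_vec, catalog_vecs, k=3):
    """Return the indices of the k catalog vectors most similar to the query.

    `catalog_vecs` is a (num_products, dim) matrix of image embeddings;
    cosine similarity is the dot product of L2-normalized vectors.
    """
    q = query_vec / np.linalg.norm(query_vec)
    c = catalog_vecs / np.linalg.norm(catalog_vecs, axis=1, keepdims=True)
    sims = c @ q                   # cosine similarity to each product
    return np.argsort(-sims)[:k]   # most similar first

# Toy 2-D "embeddings": products 0 and 2 point roughly the same way as the query
catalog = np.array([[1.0, 0.1], [0.0, 1.0], [0.9, 0.2]])
query = np.array([1.0, 0.0])
print(cosine_knn(query, catalog, k=2))  # → [0 2]
```

In production this brute-force scan is replaced by an approximate nearest-neighbor index, but the ranking principle is the same.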
For this use case we implemented a CNN model based on the ResNeXt architecture (ResNeXt-50 (32×4d)) pre-trained on the ImageNet dataset. However, the manufacturing parts we were dealing with were not adequately represented in the pre-training data, which meant we had to enhance the training dataset with about 10,000 independently sourced manufacturing part images alongside the client-supplied labeled dataset.
With the relevant datasets in place, we trained a bespoke deep learning visual similarity model using ArcFace, or Additive Angular Margin Loss - which enforces an angular margin between classes and effectively learns a task-specific distance function expressing the similarity between two objects.
The Additive Angular Margin (ArcFace) loss formula is as follows:
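In standard notation, with $N$ the batch size, $s$ the feature scale, $m$ the additive angular margin, and $\theta_{y_i}$ the angle between the embedding of sample $i$ and the weight vector of its true class $y_i$:

```latex
L_{\mathrm{ArcFace}} = -\frac{1}{N}\sum_{i=1}^{N}
\log \frac{e^{\,s\cos(\theta_{y_i} + m)}}
          {e^{\,s\cos(\theta_{y_i} + m)} + \sum_{j \neq y_i} e^{\,s\cos\theta_j}}
```

Adding the margin $m$ to the target angle makes the correct class harder to satisfy during training, which pushes embeddings of the same product closer together and different products further apart.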
With this technique we were able to improve the discriminative power of the model and stabilize the training process, which involved fine-tuning the higher-order feature representations of the base model to make them more relevant for the specific task:
The result: a visual model that was able to cluster images of the same product together for more accurate search results:
Our search solution design doesn’t end here though. Read on to find out how we dealt with searches based on text recognition, as opposed to the image of the actual products.
With the visual search model nailed, we still needed to tackle the issue of searching for parts based only on product packaging or label inputs. The visual search model is not capable of identifying text or numbers, so a visual search of a product label with a serial number will only produce visually similar images without taking into account the actual serial number:
This is where OCR enters the search design. An OCR model can extract keywords such as the serial number, batch number and manufacturer's brand, which can then be searched for in the catalog.
In our case we used a cloud-based OCR solution that met all of our requirements. Since some images may be rotated incorrectly (making it difficult to recognize the characters accurately), we added a preprocessing step that rotates the image through all four 90-degree orientations, runs character recognition on each one, calculates the average confidence score per orientation, and then selects the orientation with the highest score.
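The rotation-voting step can be sketched as follows. The real system calls a cloud OCR API; here `ocr_fn` is a pluggable stand-in for that call, and the `fake_ocr` engine below is a dummy used purely to demonstrate the selection logic.

```python
import numpy as np

def best_orientation(image, ocr_fn):
    """Try all four 90-degree rotations and keep the one the OCR engine scores highest.

    `ocr_fn` is a stand-in for the real (cloud) OCR call: it returns a list
    of (text, confidence) pairs for the given image.
    """
    best = None
    for quarter_turns in range(4):
        rotated = np.rot90(image, k=quarter_turns)
        results = ocr_fn(rotated)
        # Average confidence over all recognized tokens for this orientation
        score = sum(conf for _, conf in results) / max(len(results), 1)
        if best is None or score > best[0]:
            best = (score, quarter_turns, results)
    return best  # (score, rotations applied, OCR results)

# Dummy OCR engine that only "reads" the upright image confidently
def fake_ocr(img):
    upright = img[0, 0] == 1  # marker pixel only in the upright orientation
    return [("SN-4711", 0.95 if upright else 0.2)]

img = np.zeros((4, 4))
img[0, 0] = 1
score, turns, results = best_orientation(np.rot90(img), fake_ocr)
print(turns)  # → 3 (three more quarter-turns restore the upright image)
```

With a real OCR service, `results` would carry per-word confidences, and the winning orientation's text feeds the keyword search described next.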
As you can see from the images below, we were able to find matching keywords on packages or labels and accurately identify the correct product from the catalog:
Having resolved the visual search and OCR search requirements separately, the next steps involved homogenizing the two methods into a unified solution, and extending the keyword functionality with Elasticsearch for more robust, multi-faceted search capabilities.
Finally, we added keyword filtering capabilities to power even more relevant search results, using an Elasticsearch engine together with the Open Distro k-NN plugin. This allowed us to perform a k-NN (nearest neighbor) search inside a bool query with boost coefficients, ranking images according to two boolean clauses: results “must” match visual similarity and “should” match the OCR keywords.
This means that the application performs visual search in parallel to detecting keyword text on the image, and then uses both scores to rank the results accordingly:
Taking this bool “more-matches-is-better” approach means that the most relevant results are boosted higher in the search results.
The system now performs a combined search using the CNN feature extractor, the OCR model and additional keywords, with results containing a matching keyword weighted by an additional boost coefficient.
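A query body combining the two clauses might look like the sketch below. The index schema is an assumption: the field names (`embedding`, `ocr_text`) and the boost value are illustrative, not the client's actual mapping, and the `knn` clause follows the Open Distro k-NN plugin's query syntax.

```python
def build_search_query(query_vector, ocr_keywords, k=10, keyword_boost=2.0):
    """Bool query: results MUST be visually similar (k-NN on the image
    embedding) and SHOULD match OCR-extracted keywords, which boosts
    their relevance score."""
    return {
        "size": k,
        "query": {
            "bool": {
                "must": [
                    # Hypothetical "embedding" field holds the CNN feature vector
                    {"knn": {"embedding": {"vector": query_vector, "k": k}}}
                ],
                "should": [
                    # Hypothetical "ocr_text" field holds text indexed per product
                    {"match": {"ocr_text": {"query": kw, "boost": keyword_boost}}}
                    for kw in ocr_keywords
                ],
            }
        },
    }

body = build_search_query([0.12, -0.4, 0.7], ["SN-4711"])
# The body would then be sent with e.g. es.search(index="parts", body=body)
```

Because `should` clauses add to the score rather than filter, a visually similar product that also contains the recognized serial number outranks one that is merely visually similar.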
Watch this Visual Parts Finder demo video to see the final solution in action!
Combining these three search technologies enables maintenance engineers to perform multi-faceted searches that return the most accurate results:
The speed and accuracy of this visual parts finder search solution holds massive value for manufacturing companies competing in an aggressively fast-paced industry. Not only can you expect a marked increase in productivity from your maintenance engineers, but also a significant decrease in the cost of support resources for the part finding function, and most importantly, more time to focus on driving innovation in your industry.
If you’re interested in learning more about visual search solutions for the manufacturing industry, get in touch with us to discuss your specific needs.