In our last post in this series, we discussed the top 10 computer vision use cases in digital retail. In this post, we discuss how you would go about sourcing a computer vision (CV) solution, from acquiring the skills you need from the market or a services firm to using a platform or buying a vertical solution. We discuss the most critical factors guiding the sourcing decision, the solutions on the market, and how these come together to help you make the best choice for your business.
Before jumping into decision points, we should consider some of the background factors that govern these decisions:
Custom or pre-packaged? If you have unique requirements or want to own the application yourself, building it is best for you. If you need it yesterday and have reasonably standard requirements, buying may be your answer.
Cost and value considerations should include both development costs and the total cost of ownership. Some pre-packaged solutions carry ongoing license fees, which shift spending from development to operations. In general, the build option incurs most of its costs upfront in development, while the buy option incurs licensing costs throughout the lifetime of the project. The lowest price may not be the best value.
Consider lock-in vs. flexibility. Some package decisions may lock you into one vendor, while others may be flexible enough to run on multiple platforms. Lock-in may prove costly in the future if your requirements change. If you are using a cloud API, there is also lock-in, since cloud APIs are not interchangeable. If you have already chosen your cloud vendor, this may not be a factor.
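The trade-off between upfront and recurring costs can be made concrete with a little arithmetic. The sketch below is a minimal illustration in Python, not a pricing model: the function names and every dollar figure are hypothetical, and the optional `annual_maintenance` parameter is an added assumption for teams that want to account for upkeep of a custom build.

```python
def build_cost(upfront_dev, years, annual_maintenance=0.0):
    """Cumulative cost of a custom build after `years` years."""
    return upfront_dev + annual_maintenance * years


def buy_cost(integration, annual_license, years):
    """Cumulative cost of a licensed solution after `years` years."""
    return integration + annual_license * years


def break_even_year(upfront_dev, integration, annual_license,
                    annual_maintenance=0.0, horizon=10):
    """First year in which building costs no more than buying.

    Returns None if that never happens within `horizon` years.
    """
    for year in range(1, horizon + 1):
        build = build_cost(upfront_dev, year, annual_maintenance)
        buy = buy_cost(integration, annual_license, year)
        if build <= buy:
            return year
    return None


# Hypothetical figures, for illustration only.
year = break_even_year(upfront_dev=500_000, integration=50_000,
                       annual_license=150_000)
print(year)
```

With these made-up numbers, buying is cheaper for the first two years and the build option catches up in year three. Replacing the inputs with your own estimates shows how sensitive the comparison is to the expected lifetime of the project.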
Do you have the required specialized skills on your team to be successful? Fill any gaps by internal training, external hiring, or some form of professional services ranging from complete outsourcing to co-innovation to out-staffing.
Computer vision requires a lot of training data. Any new venture in CV suffers from the cold start problem, overcome only with vast amounts of initial data. Does your organization have data to train and test models, or will the dataset need to be acquired?
When do you need this to go live? What are the external factors driving the schedule decisions? Do you need to scale up your infrastructure dramatically to build these new features, or are you more interested in testing and trying new things? The level of commitment determines the urgency.
Using the figure below, you can plot your needs against the decision criteria we have assembled. If you have additional considerations unique to your project, add your own sliders. You can then compare your slider positions with the slider ranges illustrated below. Match the sliders to determine what type of CV vendor best fits your requirements. The dots on our diagram are sample markers that represent an example of your needs for each factor.
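The slider-matching exercise can also be expressed as a small scoring routine. The following Python sketch is an illustration only: the 0 to 10 scale, the factor names, and every range in `PROFILES` are placeholder assumptions, not values taken from the figure, and should be replaced with your own reading of it.

```python
def count_matches(needs, profile):
    """Number of factors where the need falls inside the vendor's range."""
    return sum(
        1
        for factor, value in needs.items()
        if factor in profile and profile[factor][0] <= value <= profile[factor][1]
    )


def rank_vendor_types(needs, profiles):
    """Vendor types sorted from most to fewest matching factors."""
    scores = [(name, count_matches(needs, profile))
              for name, profile in profiles.items()]
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Hypothetical slider ranges (low, high) on a 0-10 scale.
PROFILES = {
    "data provider": {
        "customization": (6, 10), "in_house_skills": (6, 10),
        "data_available": (0, 4), "schedule_urgency": (5, 10),
    },
    "engineering services": {
        "customization": (6, 10), "in_house_skills": (0, 5),
        "data_available": (5, 10), "schedule_urgency": (5, 10),
    },
    "vertical solution": {
        "customization": (0, 4), "in_house_skills": (0, 5),
        "data_available": (0, 10), "schedule_urgency": (6, 10),
    },
    "platform provider": {
        "customization": (0, 5), "in_house_skills": (0, 6),
        "data_available": (0, 10), "schedule_urgency": (8, 10),
    },
}

# Hypothetical needs: highly custom, few in-house skills, plenty of data.
needs = {"customization": 8, "in_house_skills": 2,
         "data_available": 8, "schedule_urgency": 7}

for name, score in rank_vendor_types(needs, PROFILES):
    print(f"{name}: {score} of {len(needs)} factors match")
```

Counting matching factors is the simplest possible rule. If some factors matter more to you than others, a weighted score would be a natural extension.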
So you have decided that computer vision fits into your company’s future. Now what? You might need help; maybe you cannot do it all yourself. There are support options available to you, depending on your needs.
Development resource companies divide into four segments: data providers, engineering services, vertical solutions, and platform providers. Each offers its clients different capabilities in developing and implementing computer vision solutions:
If you want to own the solution, have the necessary data, and are not on a tight schedule, you may want to consider building it yourself and taking advantage of in-house resources. Before looking outside, you may identify some resources you already have in-house, such as engineers knowledgeable in open-source or cloud API programming. Your decision to use outside resources is not a binary one. You can take advantage of some in-house resources while outsourcing others. You have what you need. Go for it!
You want to quickly build a customized solution yourself and have the skills and the team to do it, but don’t have the required annotated images and training data. Annotation services vendors specialize in providing this data and annotation services to get your model operational. Images are pre-labeled, and quality is guaranteed.
If you want to build customized solutions and scale up their development right away, and you have the data but not the skills, then a services company is the way to go. Hiring and retaining specialists will often take months, so a partnership with an engineering services vendor is an ideal solution. The right development staff is assembled quickly with the skill sets you require for a fast development schedule. The team works full time on the project and is then rotated elsewhere at its conclusion.
Good services companies bring expertise, knowledge sharing, and agile best practices. They collaborate on innovation projects, augment teams, apply continuous integration / continuous delivery methodologies, and contribute new ideas and perspectives. Incidentally, Grid Dynamics, author of this blog, is a pure-play digital transformation engineering services provider in this space.
Your journey into AI and CV is just beginning. You are looking for ground-up development, and your requirements are basic. Several vendors can deliver application development, infrastructure, architecture, and systems integration in a pre-packaged solution.
There are two flavors of this approach. Some companies provide vertical solutions to narrowly defined specific use cases, such as e-commerce. If your enterprise falls into a category covered by a vertical solution provider, you have it all worked out for you. Otherwise, you can get pre-packaged generalized solutions from computer vision platform providers, which may address most of your requirements to get you started.
Specialty vertical solution providers offer complete packages tailored to a specific industry. The specialized needs of the company are already built into the mostly pre-packaged solution. Integration is easy, with limited customization options. The pre-packaged nature of these solutions allows for fast deployment for companies with schedule constraints.
Platform providers offer generalized, almost off-the-shelf solutions. Platform providers are ideal for smaller companies that cannot afford extensive customization or personalization. The application may be available pre-trained, or the company can retrain a generalized solution with its custom data. Due to their off-the-shelf nature, these solutions can be deployed quickly for clients with extremely tight schedule constraints.
If you decide to have a computer vision application custom-built to your specifications, you have a few more choices to make. We discuss where to host the application physically, and the types of code, proprietary or open-source, to use.
You can elect to have your application hosted on in-house servers that your company is responsible for maintaining, or you can host on a public cloud. In-house servers have fallen out of favor due to their acquisition and maintenance costs, as well as their lack of flexibility. Even if your company already has in-house servers, it may make sense to use cloud resources to avoid adding load to your existing hardware.
By far, the preferred platform solution is the public cloud. The top three vendors in this space are Amazon Web Services (AWS), Microsoft Azure, and Google Cloud Platform (GCP). If you are new to cloud computing, I would strongly suggest using one of these three, as they all have excellent customer relations and technical service to fit anything from one-man shops to multinational mega-corporations.
There is no reason to go the proprietary route. Gone are the days when you needed to write custom software to get what you need. Proprietary software may incur ongoing royalty costs and lock you into a sometimes archaic architecture. These limitations are needless these days.
If you decide to host on the cloud, you have another decision to make. Should you rely on a cloud API (Application Programming Interface) or use open-source modules? Each cloud vendor provides an API, which is an easy way to build functionality quickly. It does, however, have a downside. Each cloud vendor has its own set of APIs, which tends to lock you into that cloud vendor. Cloud lock-in may not be an issue if you already have a preferred cloud vendor and have no intention of switching in the future. You should, however, be aware of this limitation in your decision-making process.
An open-source software module is not supported by a single company; rather, it is maintained by many members of the programming community. These modules are licensed free of charge under industry-standard licenses such as the MIT License, the Apache License, or the Common Development and Distribution License. Many industry heavyweights, such as Google, Facebook, Amazon, and others, contribute to the open-source community. These companies typically develop software for their own use and then release it to the industry for the public good.
Cloud APIs and open-source modules overlap in functionality, but the good news is they are not mutually exclusive. You may like the functionality of a cloud API, so go ahead and use it. Other parts of your application can still use open-source modules. One limitation of this model is that, when choosing cloud APIs, you still have the restrictions mentioned earlier. You must pick an API from one cloud vendor only and are locked into that vendor. Open-source modules are portable, so they can run on any cloud vendor’s hardware.
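One common way to mix the two, and to limit how far lock-in spreads through your code, is to hide each option behind a shared interface. The Python sketch below shows one possible shape for this and is not a reference design: `ImageLabeler`, `CloudApiLabeler`, `OpenSourceLabeler`, and the two stand-in functions are hypothetical names, and the response formats are assumptions that do not correspond to any real vendor SDK or model library.

```python
from typing import Callable, Dict, List, Protocol, Tuple


class ImageLabeler(Protocol):
    """What the rest of the application depends on."""

    def label(self, image: bytes) -> List[str]: ...


class CloudApiLabeler:
    """Adapter around a vendor call that returns dicts with name and score."""

    def __init__(self, call_api: Callable[[bytes], List[Dict]],
                 min_score: float = 0.5):
        self._call_api = call_api
        self._min_score = min_score

    def label(self, image: bytes) -> List[str]:
        return [item["name"] for item in self._call_api(image)
                if item["score"] >= self._min_score]


class OpenSourceLabeler:
    """Adapter around a local model that returns (name, score) tuples."""

    def __init__(self, run_model: Callable[[bytes], List[Tuple[str, float]]],
                 min_score: float = 0.5):
        self._run_model = run_model
        self._min_score = min_score

    def label(self, image: bytes) -> List[str]:
        return [name for name, score in self._run_model(image)
                if score >= self._min_score]


def describe(labeler: ImageLabeler, image: bytes) -> str:
    """Application code: works with either adapter."""
    return ", ".join(labeler.label(image))


# Stand-ins for a real cloud client and a real open-source model.
def fake_cloud_call(image: bytes) -> List[Dict]:
    return [{"name": "shoe", "score": 0.9}, {"name": "bag", "score": 0.2}]


def fake_local_model(image: bytes) -> List[Tuple[str, float]]:
    return [("shoe", 0.8), ("hat", 0.6)]


print(describe(CloudApiLabeler(fake_cloud_call), b""))
print(describe(OpenSourceLabeler(fake_local_model), b""))
```

In this arrangement only the adapter knows about the vendor. Switching cloud providers, or replacing a cloud API with an open-source module, means writing one new adapter rather than touching every caller.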
Computer vision is obtainable for many enterprises, large or small. CV development does not have to be hard if you use outside vendors who offer various degrees of support. However, choosing the right level of support and the right vendor is vital to a successful outcome. I have presented an overview of the decisions involved in choosing vendors and assistance. For further details, contact the vendors’ sales departments. Grid Dynamics can answer any additional questions you may have about artificial intelligence and computer vision.
If you are interested in finding out more about computer vision, deep learning and artificial intelligence, please sign up for our newsletter. We cover actual success stories as well as deeper dives into the technologies that make it all happen.