
IoT platform: A starter kit for Azure

Dmitry Mezhensky

Dmitry Afonkin

May 17, 2023 • 8 min read

Table of Contents

The building blocks of a modern IoT platform
Azure IoT reference architecture
Conclusion

Industrial IoT (IIoT) has become mainstream in a broad range of industries, including manufacturing, supply chain, logistics, energy, smart cities, and agriculture. IoT solutions enable intelligent decision-making for use cases such as predictive maintenance, visual quality control, and anomaly detection. However, to achieve these goals, companies need to implement a reliable, secure, and high-performance IoT data management platform, which is a challenging and expensive task to perform from scratch. Fortunately, most major cloud providers offer native IoT services that help to rapidly develop IIoT solutions.

In this blog post, we present a blueprint and reference implementation for an IoT platform in Microsoft Azure, which provides best-in-class cloud-native services for IoT. This reference implementation can be used as a starter kit to accelerate the delivery of applied IoT projects. The reference implementation is available on GitHub.

The building blocks of a modern IoT platform

A modern IoT platform has a complicated structure with many tiers for data processing, device management, system monitoring and machine learning (ML). Therefore, it is recommended to implement such a complex system from a set of logical building blocks. The most important building blocks to consider are:

Data storage. A massive amount of semi-structured and structured telemetry data needs to be stored. According to data engineering best practices, the storage should be cost-effective, abstracted from processing engines, and well integrated with ML engines.
Data processing engine. Processing data from an edge site is mandatory before performing analytics and ML processes. The engine needs to run the data pipelines against a massive amount of data in a scalable and reliable way.
Device management system. Managing many devices manually is challenging, so this system is necessary for managing edge component updates, secure connectivity, auditing and monitoring.
ML engine. The ML engine is responsible for the entire ML development cycle, from model training to production use, and continuous improvements based on performance metrics. Moreover, model observability and A/B testing capabilities should also be part of the modern ML engine.
Contextualization. The telemetry data from devices, enriched with context data, enables logical and virtual representation of real-world processes. In the case of manufacturing equipment, these representations could be specifications, rooms, usage history, or anomaly indications. Manufacturing and supply chain devices might be represented as digital twins, reflecting valuable attributes and relationships. Modern contextualization technologies enable immersive digital twins with 3D visualization of rooms and equipment.
Edge runtime. To make data processing and ML inference as cost-effective, performant and reliable as possible, the edge runtime needs to be able to work without being connected to cloud systems.

Considering the building blocks above, it’s possible to build a strategy by applying a bottom-up approach. In this case, the strategy starts with edge device solutions, then moves to cloud data services, and ends with data management:

Define the scope of IoT devices, sensors, and gateways for data transmission to the cloud.
Define a set of protocols and security requirements.
Determine business logic that can perform at the edge, such as data processing and ML inference. This enables a cost-effective data flow without transmitting unnecessary data.
Choose an edge runtime to execute business logic.
Choose a device management system to enable monitoring, software updating, and security.
Determine a strategy to store, process, and analyze data.
Choose the approach to contextualize data. One of the possible solutions is building digital twins that integrate all kinds of metadata and telemetry data into logically connected and reflected business contexts.

In addition to the given strategy and building blocks, for mission-critical systems, the overall architecture must possess maintainability, fault-tolerance, and high availability. Managed and serverless solutions can be used to accomplish these goals, as discussed in the following sections.

Azure IoT reference architecture

Developing all components of an IoT platform from scratch is expensive and inefficient. Thankfully, Azure provides an ecosystem of managed and serverless services for building a modern IoT platform at lower cost and greater speed. The reference architecture is shown in Figure 1:

Figure 1: Azure IoT Platform architecture

At a high level, the system has the following inputs:

Time-series data from the manufacturing OPC Servers.
Telemetry data from the sensors via MQTT, TCP/IP and Modbus protocols.

To handle the input data streams, an adapter needs to be written using Azure IoT Hub SDK to route messages to IoT Edge Hub. This component is described in the Edge Deployment section below.
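As a rough illustration of what such an adapter does before it hands messages to IoT Edge Hub, the sketch below normalizes readings from different protocols into one JSON envelope. The `Reading` type, the field names (`deviceId`, `source`, `ts`, `values`) and the output name `telemetry` are assumptions made for this example and are not taken from the reference implementation. The actual send through the Azure IoT SDK is deliberately left out.

```python
import json
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Reading:
    """One raw value as received from a protocol driver (hypothetical shape)."""
    device_id: str
    protocol: str         # e.g. "opcua", "mqtt", "modbus"
    name: str             # e.g. "temperature"
    value: float
    epoch_seconds: float  # time of measurement, seconds since the Unix epoch


def to_envelope(reading: Reading) -> str:
    """Serialize a reading into a protocol-independent JSON message body."""
    ts = datetime.fromtimestamp(reading.epoch_seconds, tz=timezone.utc)
    body = {
        "deviceId": reading.device_id,
        "source": reading.protocol,
        "ts": ts.isoformat(),
        "values": {reading.name: reading.value},
    }
    return json.dumps(body, sort_keys=True)


# Hypothetical output name that a route in IoT Edge Hub could match on.
OUTPUT_NAME = "telemetry"

sample = Reading("press-01", "modbus", "temperature", 71.5, 1684317600.0)
message_body = to_envelope(sample)
print(OUTPUT_NAME, message_body)
```

Keeping the envelope independent of the source protocol means downstream modules only need to understand one message shape, regardless of whether the value came from an OPC server or a Modbus sensor.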

After data collection, the system is responsible for storing, processing, visualization, and obtaining deep insights into the actual environment’s current state. To achieve this, Azure Cloud services need to be integrated into the complex IoT system.

Firstly, Azure services must be mapped to the fundamental building blocks considered in the previous section.

Each mapped service and its overall role are discussed below.

Data storage, processing, and analytics in the cloud

As mentioned, storing massive amounts of telemetry data is an expensive and challenging task. To address this, Azure offers Azure Blob Storage and Azure Data Lake Storage Gen2 as cheap and effective storage systems.

In terms of a modern, enterprise-grade cloud processing engine, Azure Cloud offers Azure Databricks. It’s a powerful and modern data lakehouse platform capable of performing data warehousing and analytics. The most important features are:

The platform enables users to build ETL (extract, transform, load) pipelines using SQL, Python, and Scala on top of Apache Spark, a data lake or custom tools.
Azure Databricks integrates with MLflow to perform various ML tasks.
The platform supports the Delta Sharing open protocol to secure data sharing with partner systems.
The platform supports built-in visualization dashboards.

Moreover, Azure Databricks can easily integrate with the industry-standard visualization tool, Grafana.

Edge deployment

To improve the cost-effectiveness of data processing, it’s advisable to apply data transformations such as filtering, aggregation, deduplication, and enrichment with ML outcomes at an edge gateway before sending the data to the cloud. This offloads work from the cloud, decreases overall traffic costs, and improves performance. In some cases, it’s the only way to build a solution, because transferring the raw data to the cloud is not possible. To address this task, Azure Stream Analytics jobs and Azure Functions are deployed as modules that run in Azure IoT Edge. Moreover, Azure IoT Edge lets an edge device act as a gateway, and runs modules as Docker-compatible containers with business logic written in the programming language of the customer’s choice.
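The transformations named above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the logic of the reference implementation: the `Sample` type, the valid value range and the 60-second tumbling window are all hypothetical choices.

```python
from collections import defaultdict
from statistics import mean
from typing import NamedTuple


class Sample(NamedTuple):
    device_id: str
    ts: int        # epoch seconds
    value: float


def deduplicate(samples):
    """Drop repeated (device_id, ts) pairs, keeping the first occurrence."""
    seen = set()
    unique = []
    for s in samples:
        key = (s.device_id, s.ts)
        if key not in seen:
            seen.add(key)
            unique.append(s)
    return unique


def in_range(samples, low, high):
    """Filter out readings outside the physically plausible range."""
    return [s for s in samples if low <= s.value <= high]


def aggregate(samples, window_seconds):
    """Average values per device over fixed (tumbling) time windows."""
    buckets = defaultdict(list)
    for s in samples:
        window_start = s.ts - s.ts % window_seconds
        buckets[(s.device_id, window_start)].append(s.value)
    return {key: mean(values) for key, values in sorted(buckets.items())}


raw = [
    Sample("s1", 0, 20.0),
    Sample("s1", 0, 20.0),    # duplicate delivery
    Sample("s1", 30, 22.0),
    Sample("s1", 45, 999.0),  # sensor glitch, outside the valid range
    Sample("s1", 60, 24.0),
]

summary = aggregate(in_range(deduplicate(raw), -40.0, 125.0), window_seconds=60)
print(summary)
```

In this example five raw readings collapse into two aggregated values, which is the kind of reduction that lowers traffic to the cloud.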

Azure provides a set of Azure IoT Edge modules to ensure deployments and routing capabilities. A key module called IoT Edge Hub is responsible for communication between the client devices, modules and Azure IoT Hub.
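To make the routing role of IoT Edge Hub more concrete, the sketch below builds the routes fragment that would sit in the desired properties of the `$edgeHub` module in a deployment manifest. The module names (`adapter`, `filter`), the output and input names, and the time-to-live value are hypothetical. The fragment is intentionally incomplete: a real manifest also needs a schema version and the module definitions.

```python
import json


def module_to_module(src, output, dst, dst_input):
    """Route messages from one module's output to another module's input."""
    return (
        f"FROM /messages/modules/{src}/outputs/{output} "
        f'INTO BrokeredEndpoint("/modules/{dst}/inputs/{dst_input}")'
    )


def module_to_cloud(src, output):
    """Route messages from a module's output upstream to Azure IoT Hub."""
    return f"FROM /messages/modules/{src}/outputs/{output} INTO $upstream"


# Hypothetical pipeline: adapter -> filter -> cloud.
edge_hub_desired = {
    "routes": {
        "adapterToFilter": module_to_module("adapter", "telemetry", "filter", "raw"),
        "filterToCloud": module_to_cloud("filter", "clean"),
    },
    # How long the hub buffers messages while the uplink is unavailable.
    "storeAndForwardConfiguration": {"timeToLiveSecs": 7200},
}

print(json.dumps(edge_hub_desired, indent=2))
```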

Azure IoT Edge enables the approach to deploy modules in a layered way by defining a hierarchy of devices. It brings scalability and maintainability at the edge site.

Besides Azure IoT Edge, we can use a Kubernetes runtime on Azure-provided hardware as a service called Azure Stack Edge. These managed devices are optimized for various edge workloads such as ML or data processing.

With many devices, it’s important to manage them reliably and securely. Azure IoT Hub can be used as a central message broker with a device management system to accomplish this goal. It’s responsible for registering devices and their security artifacts, like X.509 certificates or symmetric keys, and providing analytics and auditing.
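For devices registered with a symmetric key, authentication is typically done with a shared access signature (SAS) token derived from that key. The sketch below follows the token format documented for Azure IoT Hub as we understand it: an HMAC-SHA256 signature over the URL-encoded resource URI and the expiry time. The hub name, device ID and key are placeholders, and a production device would normally let the device SDK generate the token.

```python
import base64
import hashlib
import hmac
import urllib.parse


def generate_sas_token(resource_uri: str, key_b64: str, expiry_epoch: int) -> str:
    """Build a SAS token for a resource such as '<hub>/devices/<deviceId>'."""
    encoded_uri = urllib.parse.quote(resource_uri, safe="")
    # The signed string is the encoded URI and the expiry, separated by a newline.
    string_to_sign = f"{encoded_uri}\n{expiry_epoch}".encode("utf-8")
    key = base64.b64decode(key_b64)
    digest = hmac.new(key, string_to_sign, hashlib.sha256).digest()
    signature = base64.b64encode(digest).decode("ascii")
    return "SharedAccessSignature sr={}&sig={}&se={}".format(
        encoded_uri, urllib.parse.quote(signature, safe=""), expiry_epoch
    )


# Placeholder values for illustration only; never hard-code a real key.
dummy_key = base64.b64encode(b"not-a-real-key").decode("ascii")
token = generate_sas_token(
    "example-hub.azure-devices.net/devices/press-01", dummy_key, 1700000000
)
print(token)
```

Because the token carries an expiry time, a leaked token is only useful for a limited period, whereas the symmetric key itself never leaves the device.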

Machine learning

For a modern IoT platform, it’s essential to include ML components. ML enables anomaly detection, visual quality control and predictive maintenance. The major ML components of the system are Azure Machine Learning and the ML serving module at the edge site. The first is responsible for managing the model development cycle, from data preparation to deploying a module to the edge. The second is an IoT Edge module, or a Docker image in the case of a Kubernetes runtime, that performs model inference via a REST API or gRPC.

Once a model is deployed in production, the next critical task is monitoring its performance and improving it continuously. To address this, additional edge modules need to be built that calculate ML observability metrics and send them to the cloud in a separate data stream alongside the telemetry data. Analyzing these metrics with Azure Databricks increases ML transparency and provides deep insights into the behavior of the production model.
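The article does not prescribe specific observability metrics. As one possible example, the sketch below computes the Population Stability Index (PSI), a common measure of how far the distribution of a live input feature has drifted from the training distribution. The bin edges, the sample values and the alert threshold of 0.2 are assumptions chosen for illustration.

```python
import math


def histogram(values, edges):
    """Return the fraction of values per bin; edges split the number line."""
    counts = [0] * (len(edges) + 1)
    for v in values:
        index = sum(1 for e in edges if v >= e)  # number of edges at or below v
        counts[index] += 1
    return [c / len(values) for c in counts]


def population_stability_index(expected, actual, floor=1e-6):
    """PSI = sum((a - e) * ln(a / e)) over bins, with empty bins floored."""
    total = 0.0
    for e, a in zip(expected, actual):
        e, a = max(e, floor), max(a, floor)
        total += (a - e) * math.log(a / e)
    return total


EDGES = [20.0, 25.0, 30.0]   # hypothetical temperature bins
DRIFT_THRESHOLD = 0.2        # rule-of-thumb value, an assumption

training = [18, 21, 22, 23, 24, 26, 27, 28, 29, 31]
live = [26, 27, 28, 29, 31, 32, 33, 34, 35, 36]

baseline = histogram(training, EDGES)
drift = population_stability_index(baseline, histogram(live, EDGES))

# A metric message that an edge module could send in its own data stream.
metric_message = {
    "metric": "psi",
    "feature": "temperature",
    "value": drift,
    "alert": drift > DRIFT_THRESHOLD,
}
print(metric_message)
```

Sending a single number per feature and time window keeps the observability stream small compared with the raw telemetry.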

Digital twins

The digitization of manufacturing and other IoT enterprises brings a high demand for immersive digital representations of real-world equipment. To achieve this sophisticated task, the Azure IoT ecosystem provides the Azure Digital Twins service.

The main features of this service include:

Representing the real-world relationships of things in the system as a live knowledge graph.
Providing the query engine to query through a knowledge graph.
Providing the live 3D visualization of a real connected environment.
Providing an integrated visual view of the metrics and related things.
Providing the notification system.
Enabling the integration with downstream services via Event Hubs and Event Grid.
Providing the modeling language called Digital Twins Definition Language.
Storing the current state of things.

For this article, we explore the case of a supply chain IoT network which has complicated dependencies, as shown in the figure below. Each actor in the network has its own configuration and connections, and is monitored separately. Our IoT Platform Starter Kit for Azure comes not only with infrastructure automation, but also with an application running on top of it that mimics supply chain dependencies. Each component of the supply chain has attributes that are closely monitored: speed, temperature, humidity and other meaningful parameters.

The simulation script, which is deployed on top of the platform, simulates the network of devices, and sends topology and telemetry data. It’s usually a challenge for enterprises to visualize topology and dependencies, and to provide tooling for managing them. Azure Digital Twins Explorer addresses this problem, while Azure IoT Hub collects the information and represents it in Digital Twins Explorer, as shown below in Figure 3. For detailed information, please refer to the GitHub repository or contact the Grid Dynamics team.
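The script in the repository is not reproduced here. Purely as an illustration of the idea, the following sketch generates a small topology and a stream of telemetry messages with the attributes mentioned above. The node names, value ranges and message fields are all hypothetical.

```python
import random

# Hypothetical supply chain: who delivers to whom.
TOPOLOGY = {
    "nodes": ["factory-1", "warehouse-1", "store-1"],
    "edges": [("factory-1", "warehouse-1"), ("warehouse-1", "store-1")],
}


def telemetry(node_id, rng):
    """Generate one telemetry reading with made-up value ranges."""
    return {
        "deviceId": node_id,
        "temperature": round(rng.uniform(2.0, 8.0), 2),
        "humidity": round(rng.uniform(30.0, 60.0), 2),
        "speed": round(rng.uniform(0.0, 90.0), 2),
    }


def simulate(steps, seed=0):
    """Yield one message per node per step; a fixed seed makes runs repeatable."""
    rng = random.Random(seed)
    for step in range(steps):
        for node in TOPOLOGY["nodes"]:
            yield {"step": step, **telemetry(node, rng)}


messages = list(simulate(steps=2, seed=42))
for message in messages:
    print(message)
```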

In this figure, you can discover the components and relationships. For example, Air Shipment is an extended shipment entity that delivers by air. Therefore, the diagram above draws the logic model of the real system. The particular digital twin follows this logic to represent the real case. The specific case of the Air Shipment is shown in Figure 4:

Figure 4: Air Shipment example in Azure Digital Twins Explorer

You can inspect the twin’s properties, which are based on the predefined schema, on the right side of the twin model. The schema comes from the generalized entity called “Shipment”, which has a set of fields describing real-world processes and things. It enables you to track and analyze the behavior of the real complex system in real time and at scale.
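To show how such a schema could be expressed, the sketch below defines a generalized Shipment interface and an Air Shipment interface that extends it, using the Digital Twins Definition Language (DTDL), and adds a query that selects shipments by model. The model identifiers, property names and the temperature threshold are hypothetical and are not copied from the starter kit. The models are built as Python dictionaries only so that they can be serialized to JSON.

```python
import json

SHIPMENT_ID = "dtmi:com:example:Shipment;1"          # hypothetical model ID
AIR_SHIPMENT_ID = "dtmi:com:example:AirShipment;1"   # hypothetical model ID

shipment_model = {
    "@context": "dtmi:dtdl:context;2",
    "@id": SHIPMENT_ID,
    "@type": "Interface",
    "displayName": "Shipment",
    "contents": [
        {"@type": "Property", "name": "temperature", "schema": "double"},
        {"@type": "Property", "name": "humidity", "schema": "double"},
        {"@type": "Property", "name": "speed", "schema": "double"},
        {"@type": "Relationship", "name": "deliversTo"},
    ],
}

# Air Shipment inherits every field of Shipment and adds its own.
air_shipment_model = {
    "@context": "dtmi:dtdl:context;2",
    "@id": AIR_SHIPMENT_ID,
    "@type": "Interface",
    "extends": SHIPMENT_ID,
    "displayName": "Air Shipment",
    "contents": [
        {"@type": "Property", "name": "flightNumber", "schema": "string"},
    ],
}

# Query for all shipments, including subtypes, that are too warm.
query = (
    "SELECT * FROM DIGITALTWINS T "
    f"WHERE IS_OF_MODEL(T, '{SHIPMENT_ID}') AND T.temperature > 30"
)

print(json.dumps([shipment_model, air_shipment_model], indent=2))
print(query)
```

Because Air Shipment extends Shipment, a query against the Shipment model also returns air shipments, which is what makes the generalized entity useful for monitoring the whole network.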

To summarize, we have described how the cloud building block integrates into a complex IoT Platform, and how Azure enterprise-grade IoT services simplify building an IoT system and reduce time-to-market.

Conclusion

Building a modern IoT platform from scratch in the cloud can be complex and requires a lot of cross-domain expertise. A technology partner can simplify implementation and adoption by taking on the low-level work, such as infrastructure automation, IAM permissions management, integrations, and migrating applications to modern platforms.

Grid Dynamics has extensive experience developing IoT solutions, and offers various accelerators and starter kits based on Azure Cloud. We support cloud-native or cloud-agnostic approaches, and provide top industry expertise in building sophisticated IoT solutions in the cloud.

Get in touch with us to learn more about our IoT Platform starter kits.
