In an Industry 4.0 architecture, a historian acts as the long-term memory of production. While the Unified Namespace (UNS) always shows the current state of your machines, sensors, and processes, the historian continuously collects and stores relevant data over extended periods.
But what’s the value of such data?
In my experience, historical data becomes increasingly valuable when it is accessible to the right people. The historian allows process experts and production managers to retrieve past machine and process data in an organized way. Instead of tediously combing through old spreadsheets or scattered records, they find reliable and complete information readily available in visualization tools like Grafana.
Here are a few examples:
Depending on the use case, historians can be implemented in different ways.
Buy historian software or build with open-source databases?
The idea of giving your production a long-term memory isn’t new. Even before AI-readiness became a buzzword, production managers and process experts realized the potential they were missing by not storing process data in databases. However, the historian use case brings specific challenges for data storage.
Specialized historian software solutions (e.g., Canary Labs, AVEVA, or OSIsoft PI) are designed to address such requirements. However, you don’t necessarily need historian software to implement a historian. Increasingly, companies are opting to build their solutions on proven open-source technologies. Jeremy Theocharis from United Manufacturing Hub offers a detailed explanation of why this approach is worth considering in this video.
In practice, it could be realized like this...
A data logger collects relevant process data from your machine in near real-time and streams it via MQTT. An InfluxDB instance, optimized for time-series data, listens for these updates and stores them directly at the machine (Edge Historian). Key data points are aggregated and stored alongside data from other production processes, orders, and quality data in an SQL database within the corporate network (Enterprise Historian). To keep databases manageable, older time-series data is compressed into Parquet files and stored in the company’s data lake, along with images and documents. From there, the data can be accessed anytime for visualization and analysis.
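To make this concrete, here is a minimal sketch of the edge-historian step in Python, assuming a paho-mqtt subscriber feeding the official influxdb-client package. The broker address, topic layout, bucket, and token are illustrative assumptions, not values from any specific setup.

```python
import json

import paho.mqtt.client as mqtt
from influxdb_client import InfluxDBClient, Point
from influxdb_client.client.write_api import SYNCHRONOUS

# Local InfluxDB acting as the Edge Historian (URL, token, and org are placeholders).
influx = InfluxDBClient(url="http://localhost:8086", token="EDGE_TOKEN", org="factory")
write_api = influx.write_api(write_options=SYNCHRONOUS)

def on_message(client, userdata, msg):
    # Assumes the data logger publishes JSON payloads such as
    # {"temperature": 71.3, "pressure": 2.4} on one subtopic per machine.
    reading = json.loads(msg.payload)
    point = Point("process_data").tag("machine", msg.topic.split("/")[-1])
    for field, value in reading.items():
        point = point.field(field, float(value))
    write_api.write(bucket="edge_historian", record=point)

# paho-mqtt 1.x style client; version 2.x additionally expects a
# CallbackAPIVersion as the first argument to mqtt.Client().
client = mqtt.Client()
client.on_message = on_message
client.connect("localhost", 1883)   # broker the data logger publishes to
client.subscribe("plant/line1/+")   # e.g. plant/line1/<machine>
client.loop_forever()
```

The archival step can be similarly compact. A sketch, assuming pandas with pyarrow and a hypothetical Postgres-based Enterprise Historian; the connection string, query, and file path are placeholders:

```python
import pandas as pd
from sqlalchemy import create_engine

# Pull aggregated rows older than one year from the Enterprise Historian
# and compress them into a Parquet file in the data lake.
engine = create_engine("postgresql://user:secret@enterprise-db/production")
df = pd.read_sql(
    "SELECT * FROM process_data WHERE ts < now() - interval '1 year'", engine
)
df.to_parquet("/datalake/process_data_archive.parquet", compression="snappy")
```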
You might think, “Building such a data architecture must be resource-intensive.” And you wouldn’t be entirely wrong. I regularly speak with SMEs that began with specific innovation ideas and ended up:
We believe the path to a "Smart Factory" for SMEs must be leaner. This belief shapes our principles for developing PREKIT.
The first PREKIT module is delivered as a preconfigured industrial computer, complete with an integrated InfluxDB that locally stores process data and makes it available through apps for visualization and analysis. At any time, additional PREKIT modules can be flexibly added to consolidate data from the entire production line, whether in the company network or the cloud. This allows the entire reference architecture described above to be realized almost "out of the box"—as a flexible and fully open platform.