IAMM: A Maturity Model for Measuring Industrial Analytics Capabilities in Large-scale Manufacturing Facilities

Industrial big data analytics is an emerging multidisciplinary field that incorporates aspects of engineering, statistics and computing to produce data-driven insights, which can enhance operational efficiencies and produce knowledge-based competitive advantages. Developing industrial big data analytics capabilities is an ongoing process, whereby facilities continuously refine collaborations, workflows and processes to improve operational insights. Such activities should be guided by formal measurement methods, to strategically identify areas for improvement, demonstrate the impact of analytics initiatives, and derive benchmarks across facilities and departments. This research presents a formal multi-dimensional maturity model for approximating industrial analytics capabilities, and demonstrates the model's ability to assess the impact of an initiative undertaken in a real-world facility.


INTRODUCTION
Modern manufacturing facilities are becoming increasingly data-intensive. Such environments support the transmission, sharing and analysis of information across pervasive networks to produce data-driven manufacturing intelligence (Chand and Davis 2010; Davis et al. 2012; Lee, Kao, and Yang 2014). This intelligence may provide many benefits, including improvements in operational efficiency, process innovation, and environmental impact, to name a few (Fosso Wamba et al. 2015; Hazen et al. 2014). To realize these benefits, industrial information systems must be capable of storing and processing exponentially growing datasets (i.e. Big Data), while supporting predictive and scenario analytics to inform real-time decision-making (Fosso Wamba et al. 2015; Kumar et al. 2014; Lee et al. 2013; McKinsey 2011; Philip Chen and Zhang 2014; Vera-Baquero, Colomo-Palacios, and Molloy 2014). Greater data production may be attributed to increased sensing capabilities and the persistence of higher-resolution operational data. These sensing technologies encompass both legacy automation networks and emerging paradigms (e.g. Internet of Things and Cyber-Physical Systems) (Davis et al. 2012; Lee, Bagheri, and Kao 2015; Wright 2014). The data collected from these networks may be analyzed and modeled to produce data-driven insights. These technologies and processes are becoming synonymous with industrial big data analytics, which incorporates aspects of big data analytics, automation, control and engineering.
Given the contemporary and multidisciplinary nature of industrial big data analytics, measuring current industrial analytics capabilities can be difficult. Such measurements could identify areas for strategic improvement, while also illustrating the impact of historical initiatives. In other business domains, capability assessment has been achieved using maturity models. While maturity models exist for aspects of industrial analytics (e.g. big data), they do not capture the dimensions or details needed to support capability assessment in the industrial domain. Thus, this research presents the development and application of an industrial analytics maturity model to approximate capabilities across numerous operating dimensions.

RELATED WORK
Given the contemporary, diverse and multidisciplinary nature of industrial analytics, determining current capabilities and developing strategic roadmaps may prove difficult. Many of these challenges are addressed in other domains using maturity models, which approximate capabilities and highlight strengths and weaknesses in a particular area (Ayca et al. 2016). Examples of such domains include Information Technology, Software Engineering, Data Management, and Business Process Management, to name a few (Koehler, Woodtly, and Hofstetter 2015; Ngai et al. 2013; Ofner, Otto, and Österle 2015; Oliva 2016; Torrecilla-Salinas et al. 2016). While there are currently no maturity models focused specifically on industrial analytics, several models exist for measuring Big Data and Internet of Things capabilities (Halper and Krishnan 2014; IBM 2016; IDC 2016; Infotech 2016; Knowledgent 2016; Potter 2014; Radcliffe 2014). However, these models are predominantly of commercial origin, with insufficient documentation to support assessment and unclear methodological and theoretical foundations.
Maturity models reflect aspects of reality to classify capabilities (Kohlegger, Maier, and Thalmann 2009), which may be used for comparison and benchmarking (Rajterič 2010). Such models typically comprise dimensions and levels. Levels are ordinal labels that signify stages of maturity, while dimensions represent specific capabilities from the domain of interest. These dimensions may be further populated (e.g. with technologies and processes) to facilitate deeper capability assessments (Lahrmann and Marx 2010). The contents of each dimension may be derived using qualitative research methods, including case studies, focus groups and the Delphi method (Lahrmann et al. 2011). Given the potential sophistication of some models, models are generally limited to measuring a particular aspect of a domain (Rajterič 2010), although multiple models can be aligned to facilitate broader assessments. However, aligning multiple models can be challenging when different dimensions and levels exist (Kohlegger, Maier, and Thalmann 2009).
Common criticisms associated with maturity models include insufficient accuracy, poor documentation, inadequate theory, and design bias (Dinter 2012; Lahrmann et al. 2011; Lahrmann and Marx 2010). Dinter concluded that maturity models cannot fully mitigate biases, even when empirical methods are employed (Dinter 2012), while Lahrmann and Marx reported that many models are poorly documented and theoretically weak (Lahrmann and Marx 2010). There are three well-established development methodologies in the literature: De Bruin et al. (De Bruin et al. 2005), Becker et al. (Becker, Knackstedt, and Pöppelbuß 2009), and Mettler (Mettler 2009). These methodologies describe iterative approaches that facilitate continuous model improvement (Dinter 2012; Poeppelbuss et al. 2011). Therefore, maturity models must be refined and improved over time to reflect the nuances of their domain.
In summary, while maturity models have successfully addressed capability measurement in other domains (Koehler, Woodtly, and Hofstetter 2015; Lahrmann et al. 2011; Ngai et al. 2013; Ofner, Otto, and Österle 2015; Oliva 2016; Torrecilla-Salinas et al. 2016), and closely related models exist for mainstream Big Data and the Internet of Things (Halper and Krishnan 2014; IBM 2016; IDC 2016; Infotech 2016; Knowledgent 2016; Potter 2014; Radcliffe 2014), these models do not possess the depth needed to measure industrial analytics capabilities.

RESEARCH METHODOLOGY
This research employs an action research approach to design and test a maturity model for measuring industrial analytics capabilities (De Villiers 2005). This approach was chosen for its ability to link theory and practice when investigating real-world challenges (Abdel-Fattah 2015). The resulting maturity model addresses measurement, comparison and benchmarking challenges pertaining to industrial analytics capabilities. The maturity model development process of De Bruin et al. (De Bruin et al. 2005) was used to construct the Industrial Analytics Maturity Model (IAMM). This process consists of six sequential phases (Figure 1), with each phase containing criteria that characterize the model.

Phase 1 -Scope
The scope phase defines model boundaries using predefined criteria (Table 1). A model's focus can be domain-specific or generic. Generic models may be applied across different domains (e.g. quality), while domain-specific models are coupled to a particular scenario (e.g. software development). Those with an implied interest in the model's creation are known as development stakeholders. These stakeholders can inform the model's development, or benefit from its application. Examples of stakeholders include academia, practitioners, and government entities.
The IAMM was classified as domain-specific given its focus on industrial analytics, with academic researchers and industry practitioners identified as development stakeholders. These stakeholders were deemed relevant given the model enables them to (a) illustrate current capabilities, (b) highlight areas for improvement, and (c) measure the impact of initiatives. These choices are highlighted in the selection column (Table 1).



Phase 2 -Design
The design phase defines the model's architecture and application using predefined criteria (Table 2). These criteria provide a deeper understanding of (1) who will use the model, (2) why they need the model, and (3) how they can apply the model. These design details must manage the trade-off between domain accuracy and model simplicity. While simple models may not reflect the nuances of the domain, complex models may create user-adoption challenges (e.g. a time-consuming assessment process).
The IAMM's audience was classified as internal executives and management, given they are responsible for improving in-house industrial analytics capabilities. A self-assessment method controlled by staff members was chosen to measure analytics capabilities, driven by internal roadmaps and objectives (e.g. smart manufacturing). These assessments should consider multiple perspectives and dimensions (e.g. automation and mainstream technology) to evaluate maturity. A maturity model's structure and application may take two forms. First, models may employ a multi-level approach. These models adhere to the continuous maturity principle, where different dimensions of the model may assert different maturity levels. This approach is useful for modeling multifaceted domains and highlighting strengths and weaknesses. Second, models may employ a single-level approach. These models adhere to the staged maturity principle, which uses a single label to classify maturity. This approach may suit scenarios where natural linear progressions exist (e.g. beginner to advanced).
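To make the distinction concrete, the sketch below shows one possible way of representing the two principles in code. The dimension names mirror the IAMM's five dimensions, but the level labels and the helper function are illustrative assumptions rather than part of the model itself.

```python
# Illustrative sketch only: level labels and the helper below are assumptions,
# not the IAMM's actual definitions.

LEVELS = ["initial", "managed", "defined", "optimized"]  # hypothetical ordinal labels

# Staged (single-level) principle: one label classifies overall maturity.
staged_assessment = "defined"

# Continuous (multi-level) principle: each dimension asserts its own level,
# highlighting strengths and weaknesses across a multifaceted domain.
continuous_assessment = {
    "open_standards": "managed",
    "operation_technology": "defined",
    "information_technology": "initial",
    "data_analytics": "initial",
    "embedded_analytics": "initial",
}

def weakest_dimensions(profile):
    """Return the dimensions sitting at the lowest maturity level."""
    lowest = min(profile.values(), key=LEVELS.index)
    return [dim for dim, level in profile.items() if level == lowest]

print(weakest_dimensions(continuous_assessment))
```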

The IAMM's architecture follows a multi-level approach, given the multiple disciplines that exist in the industrial analytics domain (Table 3). This approach also provides the flexibility needed to align maturity assessment with operational goals and objectives (e.g. not all facilities may wish to enhance embedded analytics).

Further testing of the model's alignment with the domain was not deemed necessary, given prior analysis activities (O'Donovan, Bruton, and O'Sullivan 2016). This enabled the deployment of the IAMM in a large-scale manufacturing facility, where it was used to measure the impact of an energy-focused industrial analytics initiative.

Model Validity
Potential threats to the IAMM's validity may be classified as those generally associated with maturity models, and those stemming from model-specific design. Some of these threats are described in Table 5.

Accuracy

Given the IAMM focuses on approximating industrial analytics capabilities for comparison and benchmarking, accuracy was not considered a major threat. We consider assessment consistency across longitudinal analyses a greater threat. Such challenges may be addressed by refining assessment guidelines, but developing in-house assessment policies and procedures is equally important.

Scoring
There is an inherent trade-off between model granularity and usability. High-level models lack sufficient detail to guide assessment, while low-level models may impose significant overheads. The IAMM adopts a hybrid perspective, whereby a complete architecture guides assessment, but simplified scoring facilitates easy adoption. These trade-offs may be revisited in the future.

Bias
Maturity models are naturally subject to design bias, and bias cannot be avoided completely given the level of interpretation involved in model construction. To mitigate direct researcher design bias, the IAMM architecture was formed using multiple operational perspectives acquired from the factory. Where user-derived design biases exist, iterative refinement and practitioner feedback will facilitate their dilution.

Coverage

Measuring capabilities across an entire domain is somewhat unrealistic. Hence, maturity models tend to address specific aspects of a particular domain. The IAMM focuses on the operational convergences associated with industrial analytics capabilities. During model design, particular capability components were filtered to ensure coherence, while trying to preserve important capability characteristics. Similarly to the previous threats, gaps in domain coverage can be addressed using iterative model refinement and practitioner feedback.
Table 5. Summary of research validity threats

RESULTS AND DISCUSSION
This section describes the deployment and application of the IAMM to measure the impact of an energy-focused industrial analytics initiative in a large-scale manufacturing facility. The impact was determined using capability assessments recorded before and after the implementation of an industrial analytics architecture (O'Donovan, Bruton, and O'Sullivan 2016). This capability assessment was undertaken to demonstrate the application and usefulness of the IAMM as a means of measuring change, and of highlighting operational strengths and weaknesses in the context of data-driven energy operations.

Assessment Protocol
Figure 3 illustrates the assessment protocol used to measure industrial analytics capabilities in this research. The figure shows the actions undertaken by each researcher (i.e. three assessors) in the outer section (e.g. score, reason, etc.), which were collaboratively synthesized to derive the final capability levels. This enabled researchers to make their own assertions regarding capability changes, while knowing any individual bias would eventually be diluted. Table 6 summarizes each step in this assessment protocol.
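As a simple illustration of the evidence captured per assessor (score, reason and code), each entry could be recorded in a structure such as the following; the field names and score scale are assumptions introduced here, not artefacts of the protocol itself.

```python
# Illustrative sketch only: field names and the score scale are assumptions.
from dataclasses import dataclass

@dataclass
class AssessorEntry:
    component: str   # e.g. "D1.2 Cloud-to-Factory Integration"
    phase: str       # "before" or "after" the architecture implementation
    score: str       # hypothesis-statement agreement, e.g. "none" / "partial" / "full"
    reason: str      # textual rationale for the asserted score
    code: str        # label placed on the architecture diagram

entry = AssessorEntry(
    component="D1.2 Cloud-to-Factory Integration",
    phase="after",
    score="partial",
    reason="HTTP integration exists, but relies partly on a proprietary client library.",
    code="D1.2",
)
```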

Figure 3. Capability assessment protocol
Score

Each researcher evaluated and scored the hypothesis statements (Table 4) before and after the implementation of the industrial analytics architecture.

Reason
For each score asserted, the researcher was required to rationalize their decision using a textual description.

Code
In addition to a textual description, the researcher was also required to explicitly label the architecture diagram to illustrate where they envisaged the capability improvement.

Discuss
After scoring, reasoning and coding all components in the model, each researcher presented their assertions, which were discussed and evaluated by the group.

Synthesize

Finally, the individual assessments were synthesized during group discussions to form the final capability levels before and after implementation. This unified capability data is presented and discussed in the following sections.

Industrial Analytics Architecture

Figure 6 illustrates the industrial analytics architecture under assessment (O'Donovan, Bruton, and O'Sullivan 2016). This architecture was originally implemented to promote consistent data flows between multidisciplinary teams, establish clear boundaries and responsibilities, and classify data streams to facilitate industrial analytics. These streams are labeled as batch and real-time. Batch streams are responsible for acquiring, cleaning and serving operational data to build data-driven models, while real-time streams leverage these models to monitor and inform real-world factory operations.
The codes overlaid on the industrial analytics architecture (e.g. D1.2) correspond to the IAMM's hypothesis statements (Table 4). These codes were added during the assessment protocol, which required those undertaking capability assessments to explicitly highlight and rationalize their assertions. The final codes indicate capability improvements were evident across operational convergences (e.g. integration and interoperability) and analytics pipelines (e.g. building and deployment).

Open Standards
Positive changes in standards were evident across all areas excluding operational technology (Figure 5). Open standards (e.g. OLE for Process Control) were already used for building automation and control (Hong and Jianhua 2006), while no standards existed to support integration with cloud computing and analytics frameworks. This resulted in capability improvements relating to D1.2, D1.4 and D1.5. These improvements are discussed in Table 7.

D1.1 Device Standards

Open standards are currently used for building automation and control (Bacnet 2006; Hong and Jianhua 2006; Kastner et al. 2005), and the industrial analytics lifecycle implementation does not target improvements at this level. Therefore, no capability changes were expected or recorded.

D1.2 Cloud-to-Factory Integration
The industrial analytics lifecycle implementation (Figure 6) shows the Hypertext Transfer Protocol (HTTP) supporting factory-to-cloud integration (Verivue 2008). An improved capability of 'partial' was assigned, given a proprietary software library was used to support aspects of integration.

D1.3 Data I/O Acquisition

OLEDB, ODBC and standard I/O streams could be used to access energy data from repositories on the network. Similarly to device standards, the implementation being assessed does not target improvements for factory-level I/O, and therefore, no capability changes were expected or recorded.
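A minimal sketch of the kind of ODBC-based acquisition referred to above is given below; the DSN, credentials, table and column names are hypothetical placeholders rather than the facility's actual data sources.

```python
# Hypothetical sketch of ODBC-based energy data acquisition (D1.3);
# DSN, credentials, table and column names are placeholders.
import pyodbc

conn = pyodbc.connect("DSN=energy_historian;UID=reader;PWD=secret")
cursor = conn.cursor()
cursor.execute(
    "SELECT point_id, ts, value FROM energy_readings WHERE ts >= ?",
    "2016-01-01 00:00:00",
)
rows = cursor.fetchall()  # each row behaves like a tuple of columns
conn.close()
```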

D1.4 Model Building

Energy-focused data-driven models were not used before implementation; therefore, no standards existed to support such models. The industrial analytics lifecycle implementation (Figure 6) utilizes the Predictive Model Markup Language (PMML) (Data Mining Group 2016) to encode data-driven models. Full agreement with the hypothesis statement was chosen given there were no indications that PMML could not be used as the basis to encode future models.
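The study's models were built with R-based tooling; purely for illustration, the general PMML export step could look as follows in Python using the third-party sklearn2pmml package (which requires a Java runtime). The features, labels and algorithm are toy assumptions.

```python
# Illustrative PMML export (D1.4); data, features and algorithm are toy assumptions.
from sklearn.tree import DecisionTreeClassifier
from sklearn2pmml import sklearn2pmml
from sklearn2pmml.pipeline import PMMLPipeline

X = [[18.5, 0.4], [21.0, 0.9], [17.2, 0.3], [23.4, 1.1]]  # e.g. supply temp, valve position
y = [0, 1, 0, 1]                                           # 0 = normal, 1 = suspected fault

pipeline = PMMLPipeline([("classifier", DecisionTreeClassifier(max_depth=3))])
pipeline.fit(X, y)
sklearn2pmml(pipeline, "ahu_fault_model.pmml")  # serialize the fitted model as PMML
```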

D1.5 Model Scoring
Given the lack of data-driven models, standards to facilitate the scoring of energy data were not previously necessary. The industrial analytics lifecycle implementation (Figure 6) employs web services to score data-driven models. These services are initiated using HTTP requests, while data exchanges are facilitated using JavaScript Object Notation (JSON). Full agreement with the hypothesis statement was deemed appropriate given the complete use of standards from the client side.
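For illustration, a client-side scoring request of the kind described above might look as follows; the endpoint URL and payload fields are hypothetical.

```python
# Hypothetical client-side scoring request over HTTP/JSON (D1.5).
import requests

payload = {"supply_temp": 21.0, "valve_position": 0.9}
response = requests.post(
    "https://analytics.example.com/score/ahu_fault_model",
    json=payload,
    timeout=10,
)
response.raise_for_status()
print(response.json())  # e.g. {"prediction": 1, "probability": 0.87}
```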

Operation Technology
Positive changes in operation technology largely stemmed from data accessibility and the availability of cloud computing technologies (Figure 7). This resulted in capability improvements relating to D2.2 and D2.3. These improvements are discussed in Table 8.

D2.1 Data Archiving

Full maturity was applied given the Building Management System (BMS) logs all energy-related data points in the facility.

D2.2 Data Accessibility
Existing energy data repositories exhibited arbitrary naming conventions and were largely inaccessible to networked users and processes.Improvements were realized using a workflow engine to contextualize data segments, while processed data was accessible via HTTP.

D2.3 Cloud Integration
Solely in the context of energy operations, auto-scaling compute resources were implemented to handle large-scale data processing and requests. Given the ingestion and processing of all energy data in the facility was previously demonstrated, full maturity was assigned in this instance.

D2.4 Resource Provisioning
No specific policies or processes existed to support provisioning of tools or technologies for industrial analytics.Given the technical nature of the industrial lifecycle implementation, such capabilities were not addressed or affected.

D2.5 Response Time
General policies for provisioning resources were not aligned with the quick turnaround times specified in the hypothesis statement.Given the technical nature of the industrial lifecycle implementation, such capabilities were not addressed or affected.

Information Technology
Given only minor convergences existed between operation and information technology for energy operations, many positive capability changes were observed (Figure 8). These improvements are discussed in Table 9.

D3.1 Data Management
While factory-level energy repositories used arbitrary naming for data points, the implemented data lake comprised many tags that described the origin and application of the data. These tags were used to form a catalogue to identify data sources for mapping and cleaning operations.
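A minimal sketch of such a tag catalogue is shown below; the tag vocabulary and source names are assumptions used only to illustrate how tagged sources can be located for mapping and cleaning.

```python
# Illustrative tag catalogue (D3.1); tag vocabulary and source names are assumptions.
catalogue = [
    {"source": "bms/ahu_04/supply_temp", "origin": "BMS",      "application": "hvac_energy"},
    {"source": "bms/chiller_01/power",   "origin": "BMS",      "application": "plant_energy"},
    {"source": "meters/site/main_kwh",   "origin": "metering", "application": "site_energy"},
]

def find_sources(entries, **tags):
    """Return sources whose tags match every requested key/value pair."""
    return [e["source"] for e in entries
            if all(e.get(k) == v for k, v in tags.items())]

print(find_sources(catalogue, origin="BMS", application="hvac_energy"))
```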

D3.2 Large-scale Processing
Given the auto-scaling configuration used during the industrial analytics lifecycle implementation, data ingestion and workflow processes exist to manage large datasets and interoperate with big data tools.

D3.3 Pipeline Automation
Formally implemented workflow processes facilitated the turnkey cleaning and transformation of energy data. This resulted in analytics-ready data being served to end-users and processes.
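As an illustration of the kind of cleaning and transformation step such a workflow might automate, a hedged pandas sketch is given below; the column names and resampling interval are assumptions.

```python
# Illustrative cleaning/transformation step (D3.3); columns and interval are assumptions.
import pandas as pd

def clean_energy_data(raw: pd.DataFrame) -> pd.DataFrame:
    df = raw.copy()
    df["timestamp"] = pd.to_datetime(df["timestamp"])
    df = df.set_index("timestamp").sort_index()
    df = df[df["value"].notna()]                       # drop missing readings
    df = df[(df["value"] >= 0) & (df["value"] < 1e6)]  # drop implausible values
    return df["value"].resample("15min").mean().to_frame("value")  # align to a common interval
```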

D3.4 Resource Provisioning
On-demand cloud computing enabled the seamless provisioning of virtual resources to support industrial analytics efforts.

D3.5 Response Time
The provisioning of additional resources for existing infrastructure was automated to reduce provisioning time.

Data Analytics
Positive changes in data analytics were demonstrated by the use of statistical tools to apply analytical methods and deploy data-driven models (Figure 9). This resulted in capability improvements relating to D4.1, D4.3, D4.4 and D4.5. These improvements are discussed in Table 10.

D4.1 Data Modelling

Existing information systems were used to display energy data and operations, with no apparent application of statistical data analysis. Post-implementation, such activities were demonstrated using RStudio and associated software packages.

D4.2 Line-of-Business Reporting

Some aspects of energy operations demonstrated ad hoc analysis using MS Excel and MS SQL. These capabilities were not targeted or affected by the lifecycle implementation.

D4.3 Descriptive Analytics
The implementation demonstrated descriptive analytics using RStudio to identify anomalies in time-series trends for Air Handling Units (AHUs) in the factory. These capabilities were directly enabled by the accessibility of clean and processed energy data from the workflow engine.
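The descriptive analysis itself was performed in RStudio; purely to illustrate the general idea, a rolling-statistics anomaly flag could be expressed as follows, with the window size and threshold being assumptions.

```python
# Illustrative rolling-statistics anomaly flagging (D4.3); window and threshold are assumptions.
import pandas as pd

def flag_anomalies(series: pd.Series, window: int = 96, threshold: float = 3.0) -> pd.Series:
    """Mark points deviating more than `threshold` rolling standard deviations from the rolling mean."""
    rolling_mean = series.rolling(window, min_periods=window // 2).mean()
    rolling_std = series.rolling(window, min_periods=window // 2).std()
    return (series - rolling_mean).abs() > threshold * rolling_std
```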

D4.4 Advanced Analytics
The implementation demonstrated advanced analytics capabilities by training a machine learning model to automatically identify issues with heating components in AHUs. These capabilities were informed by findings from the previously mentioned descriptive analytics efforts.
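A hedged sketch of training such a fault classifier is shown below; the features, labels and algorithm choice are illustrative assumptions rather than the study's actual model.

```python
# Illustrative fault-classifier training (D4.4); features, labels and algorithm are assumptions.
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Hypothetical features: e.g. supply/return temperature delta, heating valve position, fan status
X = [[2.1, 0.80, 1], [0.3, 0.90, 1], [2.4, 0.10, 0],
     [0.2, 0.85, 1], [2.2, 0.75, 1], [0.4, 0.95, 1]]
y = [0, 1, 0, 1, 0, 1]  # 0 = normal, 1 = heating-component fault

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=42)
model = RandomForestClassifier(n_estimators=100, random_state=42).fit(X_train, y_train)
print(model.score(X_test, y_test))
```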

D4.5 Model Deployment
The implementation facilitated the deployment of PMML-encoded data-driven models to accessible cloud-based repositories. This enabled models to interface with scoring components to facilitate deployment in the factory.
Table 10. Data analytics assessment
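On the server side, a deployed PMML model could be exposed through a scoring component along the following lines; the package choices (pypmml and Flask), paths and field names are assumptions, and pypmml additionally requires a Java runtime.

```python
# Illustrative server-side scoring component for a deployed PMML model (D4.5);
# package choices, paths and field names are assumptions.
from flask import Flask, jsonify, request
from pypmml import Model

app = Flask(__name__)
model = Model.load("ahu_fault_model.pmml")  # e.g. fetched from the cloud model repository

@app.route("/score/ahu_fault_model", methods=["POST"])
def score():
    features = request.get_json()            # e.g. {"supply_temp": 21.0, "valve_position": 0.9}
    return jsonify(model.predict(features))  # output fields are defined by the PMML model

if __name__ == "__main__":
    app.run(port=8080)
```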

Embedded Analytics
Positive changes in embedded analytics stemmed from the ability to operationalize analytics models informed by subject matter experts (Figure 10). This resulted in capability improvements relating to D5.1, D5.2, and D5.3. These improvements are discussed in Table 11.

Incorporating subject matter expertise was facilitated by the analytics lifecycle, where knowledge relating to AHU diagnostics was used to guide the construction and deployment of a diagnostics application (D5.1).

D5.2 Operational Knowledge
This particular capability was graded 'partial', given expertise relating to industrial energy, utilities and diagnostics was used to demonstrate the analytics lifecycle implementation.

D5.3 System Integration
The operationalization of data-driven models for energy operations did not exist before the implementation of the industrial analytics lifecycle. The industrial analytics lifecycle demonstrated the integration of factory-level operations with analytics output via a diagnostic application embedded in the facility.
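To illustrate the integration pattern, an embedded diagnostic application might consume the analytics output along the following lines; the endpoint, payload and alerting hooks are hypothetical.

```python
# Hypothetical embedded diagnostic client (D5.3); endpoint, payload and alerting hooks are assumptions.
import time
import requests

SCORING_URL = "https://analytics.example.com/score/ahu_fault_model"

def monitor_ahu(read_sensors, raise_alarm, interval_s=900):
    """Periodically score live AHU readings and alert on suspected faults."""
    while True:
        result = requests.post(SCORING_URL, json=read_sensors(), timeout=10).json()
        if result.get("prediction") == 1:
            raise_alarm(result)  # e.g. notify maintenance staff or the BMS
        time.sleep(interval_s)
```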

D5.4 Data Visualization
Different information systems were used in the factory to present and explore energy data recorded in the facility. The implementation did not extend these capabilities, which therefore remained unaffected.

D5.5 Key Performance Metrics

Internal metrics relating to energy consumption are used to gauge performance. Given the implementation did not enhance these capabilities, maturity levels remained the same.

CONCLUSIONS
There are many challenges associated with developing industrial analytics capabilities. Common challenges include managing heterogeneous technologies and platforms, forming multidisciplinary teams, and formalizing prescriptive approaches, to name a few. Such challenges are exacerbated further where no methods exist to measure current capability levels and strategically identify areas for improvement (e.g. via a technical roadmap). Thus, this research considered the use of maturity models to classify and quantify industrial analytics capabilities.
The Industrial Analytics Maturity Model (IAMM), which was developed during this research, was used to highlight capability improvements across energy operations after the execution of an industrial analytics initiative. These results showed positive improvements, but this was expected given energy operations had no analytics infrastructure before implementation. However, maturity assessments should not be considered isolated events, but rather a longitudinal process, where capability levels are continuously monitored, improved and compared. Such processes organically produce quantifiable benchmarks, which may be used to compare capabilities across departments and facilities. The IAMM provides a foundational framework for capability assessment, which researchers and practitioners may extend to meet specific requirements. Indeed, these refinements and extensions are necessary to improve the representation of the domain being assessed. Future work will focus on the refinement and extension of the current model, as well as the development of an IAMM-compliant cloud-based web and mobile application to support ongoing capability assessment and reporting.

Figure 4. Comparison of industrial analytics capabilities

Figure 4 illustrates the synthesized industrial analytics capabilities across energy operations, before and after the implementation of the industrial analytics architecture. While the facility's traditional energy operations and systems were state-of-the-art, the maturity assessments highlighted gaps between legacy and emerging technologies (e.g. data analytics). These gaps are assessed and discussed in the following sections.


Figure 5. Open standards comparison

Figure 7. Operation technology comparison

Figure 9. Data analytics comparison

Table 1. Scope criteria selection for IAMM

Table 2. Design criteria selection for IAMM

Table 3. IAMM architecture and dimensions

Table 4. Industrial analytics maturity model assessment

Table 6. Capability assessment protocol

Table 7. Open standards assessment

Table 8. Operation technology assessment

Table 9. Information technology assessment