Melissa Grayce
Northcentral University
Gartner defines the Internet of Things (IoT) as a network of physical objects containing
embedded technology to communicate event data based on the device sensing or interacting with
its internal or external environment (Gartner, n.d.). Experts expect the number of devices
within the IoT to reach 50 billion by 2024 (Alam, Mehmood, Katib, & Albeshri, 2016). This
number of devices would generate a massive amount of valuable data, but can the existing data
mining tools turn that data into valuable information?
Answering that question is the goal of this paper. To begin the search for the answer, the
first section describes the journey of Maersk to turn IoT data into information to assist their
organization in reaching business goals. Specifically, the case study section discusses their
business goals, data collection, and system composition. The next section presents information
on logic components and statistical techniques from studies that demonstrate similar solutions
built on data elements comparable to those in the Maersk case study. Finally, the data
mining theory section focuses on current and future machine learning research. Researchers
assert the machine learning technique, known as deep learning, provides insights from IoT data
not possible using other data mining algorithms (Alam et al., 2016). It is that assertion which
Case Study
Maersk A/S. It is the largest container shipping company, operating in 343 ports across 121
countries (Murison, 2016). The company understands the value of data and technology to its revenue.
Its journey toward valuing data began with the information from Maersk.com. It is a
massive B2B transaction site supporting over 60,000 customers and averaging $1.3 million in
GRAYCEMTIM8130-8 3
revenue per hour (Sharma, Shrivastava, Laghate, & Mendonca, n.d.). The significant amount of
detail in this system introduced the company to the value of data mining related to supply chain
known as OceanPro (Sharma et al., n.d.). The sole purpose of this organization is to enable
The Maersk journey in implementing IoT technology began in 2012, when the company partnered
with Ericsson to install real-time monitoring across its fleet (Murison, 2016). Fortunately, it was a
well-publicized journey, resulting in the availability of the information in this section related to
Business Goals
Maersk advanced four business goals through its IoT
initiatives. First, saving fuel has a direct impact on the bottom line. To that end, it
installed flow meters and fuel sensors (Paris & Sudal, 2018). The company integrated data from
these sensors with weather and sea current forecasts to save fuel (Paris & Sudal, 2018). The
predictive models used in data mining can use historical and current data to predict fuel
consumption in time to allow for management decisions to reduce the impact of weather and
current changes. Equally crucial to managing fuel decisions at sea is monitoring refrigerated
containers.
While at sea, the containers are isolated from support in the event of a power failure
(Murison, 2016). The company uses over 300,000 containers to transport food items, which
require climate control throughout the shipment. Their business goal related to these containers
The next business goal is to decrease the manual inspection of the containers while at a
port. Historically, each container received a costly pre-trip inspection (PTI); however, data from
the container sensors allow the company to remotely analyze the container and compare the
current condition to the expected condition (Murison, 2016). Not performing inspections of
every container reduces labor and expense, while increasing the safety of the port staff by
limiting their interaction with the containers (Murison, 2016). Limiting inspections is only one
The final goal is to reduce the idle time at ports. Through better coordination of port
activities, where predictive analysis aids the scheduling, the company can save time and money
(Paris & Sudal, 2018). These goals are all within reach using the data gained through the IoT
The Ericsson and Maersk partnership deployed thousands of sensors on each ship, so that
a single ship is capable of transmitting more than 2 gigabytes of data to the Maersk systems each day
(Matthews, 2017). Maersk systems ingest the data directly from the sensor transmissions,
One of the systems consuming this data is the Remote Container Management (RCM)
system, which monitors the condition and temperature of the shipping containers. The company
was able to reduce the container inspection process through this system. Instead of conducting a
PTI on every container, the company monitors the condition of the container. If the container
performs according to expectations, then it only receives a quick visual inspection. Maersk used
this analysis to reduce the number of PTIs by 60% (Murison, 2016). Not only has this decreased
cost and increased safety, but it also reduced the carbon emissions associated with the ship idling
at a port.
Maersk also used RCM to monitor the temperature of the containers in real time. The
technology from Ericsson transmits vital statistics via satellite, including temperature, location
and power supply (Murison, 2016). The company established climate thresholds, based on
historical data associated with the products shipped, for each container in a current shipment.
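As an illustration of this kind of threshold monitoring, the following minimal Python sketch compares incoming readings against a temperature band per product type. The product names, band values, container identifiers, and the function name find_deviations are hypothetical and are not taken from the Maersk RCM system; real thresholds would be derived from historical shipment data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClimateThreshold:
    """Acceptable temperature band for one product type, in degrees Celsius."""
    low_c: float
    high_c: float


# Hypothetical bands, invented for illustration only.
THRESHOLDS = {
    "bananas": ClimateThreshold(13.0, 14.5),
    "frozen_fish": ClimateThreshold(-25.0, -18.0),
}


def find_deviations(readings, thresholds=THRESHOLDS):
    """Return (container_id, temperature) pairs that fall outside the product band."""
    deviations = []
    for container_id, product, temp_c in readings:
        band = thresholds[product]
        if not band.low_c <= temp_c <= band.high_c:
            deviations.append((container_id, temp_c))
    return deviations


# Hypothetical readings: (container id, product type, temperature in Celsius).
readings = [
    ("C-001", "bananas", 13.8),
    ("C-002", "bananas", 16.2),
    ("C-003", "frozen_fish", -20.0),
    ("C-004", "frozen_fish", -15.5),
]
alerts = find_deviations(readings)
print(alerts)
```

In this sketch only the two containers outside their bands are returned, which mirrors the idea of acting on deviations rather than reviewing every reading.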
If there is a deviation in the temperature of a container, then the company analyzes the
measurements to identify the extent of the problem and takes action to avoid damage to the cargo
(Murison, 2016). Over a fifteen-week pilot, the company remotely changed the temperature set
points in the containers to avoid a potential claim of lost product (Murison, 2016). These actions
enabled the company to use data for cost avoidance. While all of these goals are achievable with
the right data, the hardware and personnel supporting the systems will determine the ultimate
Exploring the hardware and personnel provides insight into the operations of Maersk
systems. First, there are three components involved in the RCM system: a GPS unit, a 3G SIM
card, and a GSM antenna (Murison, 2016). The fleet of over 400 ships generates more than 30
terabytes of data per month (Matthews, 2017). This amount of data makes an on-premises
solution virtually impossible. To that end, Maersk announced its new agreement to use
Microsoft’s Azure as its preferred cloud service provider (Matthews, 2017). The system
transmits all of the data to the cloud for processing. The cloud provider should be able to
accommodate any future data growth experienced at Maersk. A cloud solution provided future
While the hardware components and cloud deployment show how serious the company is
about using data, their investment in personnel shows the same future vision. Maersk Data was
Communications System, a global satellite communication system, as well as customer service
and document management systems before being bought by International Business Machines (IBM) in
2004. Maersk Data was too small to meet the future needs of the shipping company and the
investment required to properly grow the IT group would remove focus from the core business of
Maersk (Sherriff, 2004). By selling it to IBM, Maersk Data got the investment it needed to keep
pace with the industry and Maersk got a technology supplier at the leading edge of the industry.
(Sharma et al., n.d.). One of the companies funded by OceanPro is Dhruv, a company with a
focus on data mining. Currently, Dhruv is creating an E2E tracking solution to track the progress
of a container across the ocean using existing data in Maersk systems (Sharma et al., n.d.).
Another company benefiting from OceanPro, Linkeddots, specializes in industrial IoT. They are
in the process of developing a hardware and provider agnostic application using GPS data to
provide visibility into the container as other companies transport it over land (Sharma et al.,
n.d.). Both of these solutions have the potential to increase profit at Maersk, without requiring
Maersk to build the solutions in house. Their investment in technology companies allows them
access to cutting-edge technology in the future without bearing full development costs. The
approach also has the side benefit of creating viable companies to be employers to many more
Technical Implementation
While there was a lot of published information on the IoT journey of Maersk, there were
not many details on the actual design specifications. Therefore, this section derives its
information on logic components and statistical techniques from studies demonstrating similar
Logic Components
The Cross Industry Standard Process for Data Mining (CRISP-DM) project defined a six-
phase process for data mining efforts (Wirth & Hipp, 2000). In the modeling phase of data
mining, organizations use mathematical models to describe the business logic applied to the data.
The purpose of these models is to find patterns, which either describe the data or predict future
values based on the data. This process differs from traditional statistical analysis, which focuses
on using inference to establish the parameters of the population (“What is data mining”, n.d.). In
data mining, the interaction between data preparation and modeling iteratively defines the
logic.
When considering the goal of monitoring container temperature, iteration to produce the
logical framework for a model begins with mining historical information regarding the container
type, shipping route, product type, and season to develop association rules to apply during
container monitoring. Previous research used the Apriori algorithm to find frequent itemsets and
extract 120 association rules from the input data including transit time, temperature, season,
conveyance, package, and product type (Wang & Yue, 2017). Maersk can use the same
technique to extract association rules from their historical shipping data. After validating the
rules generated by the algorithm, the company can use those rules as the threshold criteria for the
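The rule-extraction step can be sketched in a few lines of Python. This is a minimal, unoptimized version of the Apriori idea (frequent itemsets first, then rules filtered by confidence), not the implementation used by Wang and Yue (2017). The shipment records, the item labels, and the 0.4 support and 0.9 confidence cutoffs are assumptions made up for illustration.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Apriori: return {itemset: support} for every itemset meeting min_support."""
    baskets = [frozenset(t) for t in transactions]
    n = len(baskets)
    candidates = {frozenset([item]) for basket in baskets for item in basket}
    frequent = {}
    size = 1
    while candidates:
        # Keep only the candidates that appear in enough transactions.
        level = {}
        for cand in candidates:
            support = sum(1 for basket in baskets if cand <= basket) / n
            if support >= min_support:
                level[cand] = support
        frequent.update(level)
        # Join frequent itemsets into candidates one item larger, pruning any
        # candidate that has an infrequent subset (the Apriori property).
        size += 1
        candidates = set()
        for a, b in combinations(level, 2):
            union = a | b
            if len(union) == size and all(
                frozenset(sub) in level for sub in combinations(union, size - 1)
            ):
                candidates.add(union)
    return frequent


def association_rules(frequent, min_confidence):
    """Derive (antecedent, consequent, confidence) rules from frequent itemsets."""
    rules = []
    for itemset, support in frequent.items():
        for r in range(1, len(itemset)):
            for antecedent in map(frozenset, combinations(itemset, r)):
                confidence = support / frequent[antecedent]
                if confidence >= min_confidence:
                    rules.append((antecedent, itemset - antecedent, confidence))
    return rules


# Hypothetical historical shipments, each described by categorical attributes.
shipments = [
    {"summer", "reefer", "fruit", "temp_high"},
    {"summer", "reefer", "fruit", "temp_high"},
    {"summer", "reefer", "dairy", "temp_ok"},
    {"winter", "reefer", "fruit", "temp_ok"},
    {"winter", "reefer", "dairy", "temp_ok"},
]
frequent = frequent_itemsets(shipments, min_support=0.4)
rules = association_rules(frequent, min_confidence=0.9)
for antecedent, consequent, confidence in rules:
    if consequent == frozenset({"temp_high"}):
        print(sorted(antecedent), "->", sorted(consequent), confidence)
```

On this toy data, one of the extracted rules is that summer fruit shipments are associated with a high temperature reading, which is the kind of rule an analyst would validate before using it as a monitoring criterion.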
The goal of reducing container inspections can use outlier detection techniques. One
study used the K-means clustering algorithm to find groups in water consumption data (García
Valverde, González, Quevedo Casín, Puig Cayuela, & Saludes Closa, 2015). The algorithm
processes through the data iteratively using the features provided to assign each data point to a
group (Trevino, 2016). After the algorithm created the groups, and by default the logic,
the researchers used scatter plots to visualize the data. The scatter plots illustrated the presence
of outliers, which required human intervention to either validate the cause or design the remedy.
Maersk could use the K-means clustering algorithm with their historical data to find groups of
shipment types related to container condition. After grouping the data, the company can use
scatter plots to visualize current data, enabling them to recognize the outliers and predict
container condition.
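A minimal sketch of this grouping step follows, using a plain Python version of the K-means iteration and a simple distance rule in place of visual inspection of a scatter plot. The two features (temperature deviation and power draw), the data points, the starting centroids, and the rule of flagging points more than three times the median distance from their centroid are all assumptions chosen for illustration; none of them come from the cited study or from Maersk.

```python
import math
from statistics import median


def kmeans(points, initial_centroids, max_iter=100):
    """Alternate assignment and centroid update until assignments stop changing."""
    centroids = list(initial_centroids)
    assignment = None
    for _ in range(max_iter):
        # Assign every point to its nearest centroid.
        new_assignment = [
            min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_assignment == assignment:
            break
        assignment = new_assignment
        # Move each centroid to the mean of its assigned points.
        for c in range(len(centroids)):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return centroids, assignment


def flag_outliers(points, centroids, assignment, factor=3.0):
    """Flag points whose distance to their centroid exceeds factor * cluster median."""
    distances = [math.dist(p, centroids[a]) for p, a in zip(points, assignment)]
    outliers = []
    for c in range(len(centroids)):
        cluster = [d for d, a in zip(distances, assignment) if a == c]
        if not cluster:
            continue
        limit = factor * median(cluster)
        outliers += [
            p for p, d, a in zip(points, distances, assignment) if a == c and d > limit
        ]
    return outliers


# Hypothetical (temperature deviation, power draw) readings per container.
points = [
    (1.0, 1.0), (1.2, 0.8), (0.8, 1.2), (1.0, 1.2),
    (8.0, 8.0), (8.2, 7.8), (7.8, 8.2), (8.0, 8.2),
    (8.0, 14.0),
]
centroids, assignment = kmeans(points, initial_centroids=[(1.0, 1.0), (8.0, 8.0)])
outliers = flag_outliers(points, centroids, assignment)
print(outliers)
```

The flagged point still needs human review, as in the cited study: the sketch only narrows attention to containers that behave unlike the rest of their group.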
Statistical Techniques
One of the primary tools for predictive analytics is regression analysis, which is an
established statistical technique (Davenport, 2014). Regression analysis uses a model of the
relationship between variables to forecast the change of dependent variables based on the
analysis begins with an analyst defining the set of independent variables. The analyst performs a
regression analysis to identify the correlation of those variables to the dependent ones. This step
generally requires multiple iterations to produce the appropriate model. Once the analyst
establishes the model, they use the regression coefficients to generate a score forecasting the
likelihood of the future event (Davenport, 2014). To prove the slope of the regression differs
In data science, researchers use the t-test to determine whether the difference between two
sets of data is statistically significant. Although there are different types of t-tests, most are
parametric tests, which generally require the data to have a normal distribution. The analyst
needs to plot the frequency to determine the distribution. After determining the distribution, the next
step is to calculate the standard deviation and mean, which are inputs for the t-test calculation.
The larger the absolute value of the t-statistic, the stronger the evidence of a statistically significant difference.
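To make the calculation concrete, the sketch below computes a two-sample t-statistic from the mean and standard deviation of each sample. It assumes Welch's form of the test, which does not require equal variances; that choice, the function name welch_t, and the sample values are assumptions for illustration. Converting the statistic to a p-value requires the t-distribution and is left out.

```python
import math
from statistics import mean, stdev


def welch_t(sample_a, sample_b):
    """Return Welch's t-statistic for the difference between two sample means."""
    mean_a, mean_b = mean(sample_a), mean(sample_b)
    var_a, var_b = stdev(sample_a) ** 2, stdev(sample_b) ** 2
    # Standard error of the difference between the two means.
    standard_error = math.sqrt(var_a / len(sample_a) + var_b / len(sample_b))
    return (mean_a - mean_b) / standard_error


# Hypothetical daily fuel use (arbitrary units) before and after a routing change.
before = [10, 12, 11, 13, 14]
after = [8, 9, 10, 9, 9]
t_statistic = welch_t(before, after)
print(round(t_statistic, 3))
```

With these invented samples the means are 12 and 9, and the statistic is large relative to the spread of the data, which is the situation the paragraph above describes as stronger evidence of a difference.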
One of the issues with linear regression is that the analyst must choose the types of basis
functions (Mahdavinejad et al., 2018). Choosing their parameters is difficult. In these
cases, using machine learning techniques allows the model to adjust the parameters of the basis
functions as it trains on a dataset (Mahdavinejad et al., 2018). This technique highlights the
iterative nature between using data mining in preprocessing and using data mining to model.
There is a fine line between the portions of specific algorithms responsible for each activity.
time and historical feeds, creating information used to automatically make smart decisions (Alam
et al., 2016). IoT devices supply data, but data mining produces the patterns from which
organizations derive information desired for making decisions. Of course, the biggest challenge
in using data mining for IoT applications is the applicability of the conventional data mining
One problem with this research is trying to distinguish conventional data mining algorithms
from machine learning algorithms. Experts assert that mined datasets provide the input for
One study exploring the ability of conventional data mining algorithms to work for the IoT
datasets used eight data mining algorithms, which were all machine learning algorithms (Alam et
al., 2016). One vendor site asserted that machine learning takes the concept of data mining to the
next level by using the algorithms to automatically learn from and adapt to the data (“Data
mining vs. machine learning: What’s the difference?”, 2017). Since machine learning appears to
be the future of data mining, the remainder of this section focuses on current and future machine
learning research.
Machine Learning
To better understand artificial intelligence (AI), machine learning, and deep learning, this
section begins with an abbreviated overview of how the three concepts fit together. Over 60
years ago, John McCarthy coined the term AI to refer to machines capable of performing tasks
generally associated with human intelligence, including understanding language, learning, and
problem-solving as well as recognizing objects and sounds (McClelland, 2017). Machines use
algorithms to perform the tasks, but merely using algorithms does not make a machine
Machine learning is a type of AI, which describes the ability of a computer to receive a data set
and learn from it (Venkatesan, 2018). In this case, learning involves the computer adjusting its
algorithms inspired by the human brain (Brownlee, 2016). It is characterized by non-linear, high
parameter models containing sets of processing units, known as neurons, used to approximate the
relationship between inputs and outputs within a complex system (Zhang et al., 2018). The
algorithms are not task-based, but seek to discover representations from data input. The more
concepts, use their extreme learning ability to process an enormous amount of data and produce
highly accurate results (Alam et al., 2016). DLANNs achieve higher accuracy rates than other
conventional machine learning and data mining algorithms by using vectors of real numbers.
When inputs do not have a natural vector representation, embedding functions map
discrete objects, such as words, to vectors. Analysts typically use neural network embeddings to
find the nearest correlation between entities in a vector space, to provide input to a supervised
(Koehrsen, 2018). However, previous research extended the use of embeddings to create the
output schema for a knowledge base by using patterns within mentions of concepts to define
relations (Riedel, Yao, McCallum, & Marlin, 2013). Additional research extended the use of
& Maeda, 2017). This research shows the ability of machine learning to embrace ever more
complicated data types. While the research shows the art of the possible, it does not prove the
The study previously mentioned in this section compared the performance and
classification accuracy (CA) of eight common machine learning algorithms on IoT datasets. The
study was performed on the Aziz supercomputer, which has a total of 11,904 cores in 496 nodes
and delivers a peak performance of 230 teraflops (Alam et al., 2016). While this level of
computing power is not available to most business intelligence consumers, the results are still valid.
The study found that ANNs and DLANNs had the best CA but also required the most computing
power (Alam et al., 2016). The DLANN algorithm had the longest execution time (Alam et al.,
2016). However, the researchers suggest using Linear Discriminant Analysis (LDA) when
processing time matters. LDA achieved a CA of 81.85% in 0.98 seconds compared to 99.52% in
12,600 seconds by DLANN (Alam et al., 2016). That is an acceptable tradeoff for those businesses
without access to a supercomputer. There is no doubt that future research into DLANN
algorithms will work to decrease execution time by creating distributed and parallel processing
techniques.
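The size of this tradeoff is easier to judge when the reported figures are put side by side. The short calculation below uses only the accuracy and timing values cited above from Alam et al. (2016); the variable names are illustrative.

```python
# Figures reported by Alam et al. (2016), as cited in the text.
lda_accuracy, lda_seconds = 81.85, 0.98
dlann_accuracy, dlann_seconds = 99.52, 12600.0

# Accuracy gained by DLANN, in percentage points.
accuracy_gain = round(dlann_accuracy - lda_accuracy, 2)
# How many times longer DLANN ran than LDA.
slowdown = dlann_seconds / lda_seconds
# DLANN run time expressed in hours.
dlann_hours = dlann_seconds / 3600

print(accuracy_gain, round(slowdown), dlann_hours)
```

In other words, DLANN gained roughly 17.7 percentage points of accuracy at the cost of a run about 12,857 times longer, or 3.5 hours against less than a second.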
Future Research
There are two additional areas of future research for DLANN algorithms. First, noisy
data exerts significant influence over these algorithms (Mahdavinejad et al., 2018). They require
additional research into removing noise to make them more commercially viable. One area
worth considering is contrastive principal component analysis (cPCA), which discovers low-
dimensional structures unique to a dataset useful for removing noise or selecting features (Abid,
Zhang, Bagaria, & Zou, 2017). The study introducing this technique conducted experiments
demonstrating the ability of cPCA to identify dataset-specific patterns missed by PCA (Abid et al.,
2017). Extending this research to enable the technique to function inside of the DLANN
Another difficulty with neural network-based algorithms is their black box nature. Data
scientists cannot easily explain the rationale behind the model results (Mahdavinejad et al.,
2018). Specifically, the challenge is to identify the most critical descriptors or predictors and to
relate them to the property being modeled (Zhang et al., 2018). Some preliminary research into
One study provided three methods used to understand neural network models (Zhang et
al., 2018). First, Garson’s algorithm dissects the model weights to describe the relative
2018). Second, the study presents Lek’s profile method, which explores the relationship between
the outcome variable and a predictor (Zhang et al., 2018). Finally, the researchers presented the
interpretable model to show classification or regression predictions (Zhang et al., 2018). While
researchers have begun to address noise in data, more is needed to make DLANNs commercially
viable.
Conclusion
The goal of this paper was to determine if the existing data mining tools were capable of
turning IoT data into valuable information. It began by exploring the business goals of Maersk
and how their data collection and analysis efforts led to decisions that enabled them to meet
their goals. Maersk illustrated their commitment to their digital journey by creating a startup
accelerator, OceanPro, to ensure the hardware and personnel supporting their data analysis
The next section presented information on logic components and statistical techniques
from studies that demonstrate similar solutions built on data elements comparable to
those in the Maersk case study. It showed how the Apriori algorithm could create the logic
for climate thresholds for containers, as well as how k-means clustering could be used for
outlier detection when determining container inspection requirements. The statistical techniques
Finally, the data mining theory section began with a discussion regarding the fine line
between data mining and machine learning. Since machine learning appears to be an extension
of data mining encompassing many of the same algorithms, the remainder of the section focused
on current and future machine learning research. The section began with an abbreviated
overview of how artificial intelligence (AI), machine learning, and deep learning fit
together and how they differ. DLANN algorithms achieve the highest accuracy rates by using
vectors and embeddings, but they require more computing resources and have significantly
slower performance than other conventional machine learning and data mining algorithms.
Future research into DLANN algorithms needs to focus on methods to reduce the influence of
noisy data and methods to explain the black box areas of the models. The overall answer to the
question is that the tools exist to mine IoT data, but there is room for improvement.
References
Abid, A., Zhang, M. J., Bagaria, V. K., & Zou, J. (2017). Contrastive Principal Component
Alam, F., Mehmood, R., Katib, I., & Albeshri, A. (2016). Analysis of eight data mining
algorithms for smarter internet of things (IoT). Procedia Computer Science, 98, 437–442.
http://dx.doi.org/10.1016/j.procs.2016.09.068
https://machinelearningmastery.com/what-is-deep-learning/
Cross, A. (2018). Data mining vs. machine learning: What’s the difference? Retrieved from
https://www.ngdata.com/data-mining-vs-machine-learning/
Data mining vs. machine learning: What’s the difference? (2017). Retrieved from
https://www.import.io/post/data-mining-machine-learning-difference/
https://hbr.org/2014/09/a-predictive-analytics-primer
García Valverde, D., González, D., Quevedo Casín, J. J., Puig Cayuela, V., & Saludes Closa, J.
(2015). Water demand estimation and outlier detection from smart meter data using
classification and big data methods. 2nd New Developments in IT & Water Conference,
things/
https://towardsdatascience.com/neural-network-embeddings-explained-4d028e6f0526
Mahdavinejad, M. S., Rezvan, M., Barekatain, M., Adibi, P., Barnaghi, P., & Sheth, A. P.
(2018). Machine learning for internet of things data analysis: A survey. Digital
http://dx.doi.org/10.1016/j.dcan.2017.10.002
Matthews, K. (2017). What Maersk’s adoption of Microsoft Azure means for the future of
adoption-microsoft-azure-means-future-commercial-shipping-data/
McClelland, C. (2017). The difference between artificial intelligence, machine learning, and
artificial-intelligence-machine-learning-and-deep-learning-3aa67bff5991
Mishra, N., & Silakari, S. (2012). Predictive analytics: A survey, trends, applications,
Murison, M. (2016). Maersk and Ericsson collaborate for IIoT success story. Retrieved from
https://internetofbusiness.com/maersk-ericsson-iot-success/
Paris, C., & Sudal, M. (2018). With container ships getting bigger, Maersk focuses on getting
ships-getting-bigger-maersk-focuses-on-getting-faster-11545301800
Riedel, S., Yao, L., McCallum, A., & Marlin, B. M. (2013). Relation extraction with matrix
factorization and universal schemas. Proceedings of the 2013 Conference of the North
Technologies, 74–84.
Sharma, M., Shrivastava, A., Laghate, G., & Mendonca, J. (n.d.). World’s largest shipping
company is looking for tech innovation. Indian startups maybe the answer. The Economic
biz/startups/newsbuzz/worlds-largest-shipping-company-maersk-is-looking-for-tech-
innovation-indian-startups-maybe-the-answer/articleshow/68459732.cms
https://www.theregister.co.uk/2004/08/17/ibm_buys_maersk/
Song, Y., Batjargal, B., & Maeda, R. (2017). Finding the identical Ukiyo-e prints across multiple
http://dx.doi.org/10.11487/oukan.2017.0_E-4-5
https://www.datascience.com/blog/k-means-clustering
Venkatesan, M. (2018). Artificial intelligence vs. machine learning vs. deep learning. Retrieved
from https://www.datasciencecentral.com/profiles/blogs/artificial-intelligence-vs-
machine-learning-vs-deep-learning
Wang, J., & Yue, H. (2017). Food safety pre-warning system based on data mining for a
http://dx.doi.org/10.1016/j.foodcont.2016.09.048
techniques
Wirth, R., & Hipp, J. (2000). CRISP-DM: Towards a standard process model for data mining.
Zhang, Z., Beck, M. W., Winkler, D. A., Huang, B., Sibanda, W., & Goyal, H. (2018). Opening
the black box of neural networks: Methods for interpreting neural network models in
http://dx.doi.org/10.21037/atm.2018.05.32