
RESEARCH

Lab 1
Rosely Peña[a] and Jonathan Uy[b]

Full list of author information is available at the end of the article

Abstract
The analysis covers only the local flights of a single local airline departing from Ninoy
Aquino International Airport (NAIA) from August to December 2018. Feature engineering was
performed to generate additional features, such as the delay of the aircraft's previous
flight and the presence of other flights in the 30 minutes before a scheduled time of
departure. The Gradient Boosting classifier achieved the highest accuracy, 95.96%, with
weather as the most important feature. A second model that excluded the weather condition
feature was tested to identify the internal factors that influence flight delay; this GBM
classifier reached 87.5% accuracy, with concurrent flights as the top predictor. The study
also classified delays into less than 30 minutes, 30 to 60 minutes, and more than 60
minutes. With this, the models were able to determine whether a flight will be delayed with
80% accuracy when it is scheduled one day away, 70% accuracy when one week away, and 60%
accuracy when two weeks away. Further feature engineering was applied to improve
performance. The findings can help stakeholders plan for efficient operations and avoid the
additional variable costs caused by flight delays. The results indicate that flight
congestion at NAIA stems from insufficient runways and a lack of accessible alternative
airports near Metro Manila.

Keywords: frequent itemset mining; recommender system; collaborative filtering

1 Background
The analysis covers only the local flights of a single local airline departing from NAIA
from August to December 2018. Feature engineering was performed to generate additional
features, such as the delay of the aircraft's previous flight and the presence of other
flights in the 30 minutes before a scheduled time of departure. The Gradient Boosting
classifier achieved the highest accuracy, 95.96%, with weather as the most important
feature. A second model that excluded the weather condition feature was tested to identify
the internal factors that influence flight delay; this GBM classifier reached 87.5%
accuracy, with concurrent flights as the top predictor. The study also classified delays
into less than 30 minutes, 30 to 60 minutes, and more than 60 minutes. With this, the
models were able to determine whether a flight will be delayed with 80% accuracy when it is
scheduled one day away, 70% accuracy when one week away, and 60% accuracy when two weeks
away. Further feature engineering was applied to improve performance. The findings can help
stakeholders plan for efficient operations and avoid the additional variable costs caused by
flight delays. The results indicate that flight congestion at NAIA stems from insufficient
runways and a lack of accessible alternative airports near Metro Manila.
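
The two engineered features and the delay categories described above can be illustrated with a minimal sketch. It is not the authors' implementation: the record layout, the sample flights, the tail numbers, and the function names are all hypothetical, and the handling of boundary cases (a flight delayed exactly 60 minutes, or not delayed at all) is an assumption, since the text does not specify it.

```python
from bisect import bisect_left
from datetime import datetime, timedelta

# Hypothetical records: (tail_number, scheduled_departure, delay_minutes).
flights = [
    ("RP-C1", datetime(2018, 8, 1, 6, 0), 20),
    ("RP-C2", datetime(2018, 8, 1, 6, 20), 45),
    ("RP-C1", datetime(2018, 8, 1, 9, 0), 10),
    ("RP-C3", datetime(2018, 8, 1, 9, 15), 75),
]


def delay_class(delay_minutes):
    """Bucket a delay into the three categories named in the text.

    Assumption: zero or negative delay is labelled "on_time", and a delay
    of exactly 60 minutes falls into the top bucket.
    """
    if delay_minutes <= 0:
        return "on_time"
    if delay_minutes < 30:
        return "<30"
    if delay_minutes < 60:
        return "30-60"
    return ">=60"


def add_features(records, window=timedelta(minutes=30)):
    """Derive previous-aircraft delay and concurrent-flight count per flight."""
    ordered = sorted(records, key=lambda r: r[1])
    times = [r[1] for r in ordered]
    last_delay = {}  # most recent delay seen for each aircraft
    rows = []
    for tail, sched, delay in ordered:
        # Other flights scheduled in [sched - window, sched).
        lo = bisect_left(times, sched - window)
        hi = bisect_left(times, sched)
        rows.append({
            "tail": tail,
            "scheduled": sched,
            # Assumption: 0 when the aircraft has no earlier flight on record.
            "prev_aircraft_delay": last_delay.get(tail, 0),
            "concurrent_flights": hi - lo,
            "delay_class": delay_class(delay),
        })
        last_delay[tail] = delay
    return rows


rows = add_features(flights)
for row in rows:
    print(row["tail"], row["prev_aircraft_delay"],
          row["concurrent_flights"], row["delay_class"])
```

Sorting by scheduled departure lets the concurrent-flight count be read off with two binary searches per flight, and the same pass carries each aircraft's most recent delay forward. A table of rows like these would then serve as input to a classifier such as the Gradient Boosting model mentioned in the text.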

2 Methods

3 Results
4 Discussion
5 Conclusions
6 List of abbreviations
7 Declarations
8 Ethics approval and consent to participate
9 Consent for publication
10 Availability of data and materials
11 Competing interests
12 Funding
13 Authors’ contributions
14 Authors’ information

Author details
N/A
Acknowledgements
The completion of this study would not have been possible without the guidance of the Big Data and Cloud Computing professors, Christian M.
Alis, PhD, Madhavi Devaraj, PhD, and Eduardo L. David, Jr. Deep gratitude is also extended to AIM's Analytics, Computing, and Complex
Systems Laboratory for providing access to the Million Song dataset and the opportunity to carry out the project work on the laboratory's
supercomputer.
