A Hybrid Model for Classification of E-mail Fraud

T.O. Oyegoke

Research Article

International Journal of Applied Information Systems

Foundation of Computer Science (FCS), NY, USA

Volume 12 - Issue 39

Published: April 2022

DOI: 10.5120/ijais2022451926

T.O. Oyegoke. A Hybrid Model for Classification of E-mail Fraud. International Journal of Applied Information Systems. 12, 39 (April 2022), 13-24. DOI=10.5120/ijais2022451926

@article{10.5120/ijais2022451926,
  author    = {T.O. Oyegoke},
  title     = {A Hybrid Model for Classification of E-mail Fraud},
  journal   = {International Journal of Applied Information Systems},
  year      = {2022},
  volume    = {12},
  number    = {39},
  pages     = {13-24},
  doi       = {10.5120/ijais2022451926},
  publisher = {Foundation of Computer Science (FCS), NY, USA}
}

%0 Journal Article
%D 2022
%A T.O. Oyegoke
%T A Hybrid Model for Classification of E-mail Fraud
%J International Journal of Applied Information Systems
%V 12
%N 39
%P 13-24
%R 10.5120/ijais2022451926
%I Foundation of Computer Science (FCS), NY, USA

Abstract

The study pre-processed e-mail data, then formulated and validated a Particle Swarm Optimization (PSO)-based Back Propagation (BP) model for e-mail fraud detection. The model hybridizes two techniques: a nature-inspired algorithm (PSO) and an Artificial Neural Network. The dataset collected for developing the model contained fraudulent (46.3%), spam (32.6%) and ham (21.1%) e-mails. After data preparation and cleaning, 12,831 features were extracted, of which 6,382 (49.7%) were selected as relevant using PSO. The model was simulated under two splits: 70% of the dataset for training with 30% for testing, and 80% for training with 20% for testing. For the gradient-based BP algorithm, using the PSO-selected features improved accuracy by 0.27% on the 30% testing dataset and by 0.35% on the 20% testing dataset; for the PSO-based BP algorithm, the corresponding improvements were 1.51% and 1.46%. PSO-based BP outperformed gradient-based BP by 1.48% with the original features and 2.72% with the PSO-selected features on the 30% testing dataset, and by 1.46% and 2.57% respectively on the 20% testing dataset. The study concluded that the PSO-based BP algorithm improved the performance of the Multi-Layer Perceptron relative to the gradient-based Back Propagation algorithm, which has implications for improving advance fee fraud detection.
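To make the idea of PSO-based training concrete, the following is a minimal sketch of a swarm searching the weight space of a small one-hidden-layer perceptron in place of gradient descent. It is not the paper's implementation: the network size, the swarm parameters (`n_particles`, `w`, `c1`, `c2`), the mean-squared-error objective, the helper names, and the toy XOR-style data are all assumptions chosen for illustration.

```python
import math
import random


def sigmoid(s):
    # Clamp the pre-activation so math.exp cannot overflow.
    s = max(-60.0, min(60.0, s))
    return 1.0 / (1.0 + math.exp(-s))


def mlp_forward(weights, x, n_hidden):
    """One-hidden-layer perceptron; `weights` is a flat list, bias last per unit."""
    n_in, idx, hidden = len(x), 0, []
    for _ in range(n_hidden):
        s = weights[idx + n_in] + sum(weights[idx + i] * x[i] for i in range(n_in))
        hidden.append(sigmoid(s))
        idx += n_in + 1
    s = weights[idx + n_hidden] + sum(weights[idx + j] * hidden[j] for j in range(n_hidden))
    return sigmoid(s)


def mse(weights, data, n_hidden):
    """Mean squared error of the network over (input, label) pairs."""
    return sum((mlp_forward(weights, x, n_hidden) - y) ** 2 for x, y in data) / len(data)


def pso_train(data, n_hidden, n_particles=20, n_iters=100,
              w=0.7, c1=1.5, c2=1.5, seed=0):
    """Search the weight vector with a global-best PSO (assumed parameters)."""
    rng = random.Random(seed)
    n_in = len(data[0][0])
    dim = n_hidden * (n_in + 1) + n_hidden + 1
    pos = [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_loss = [mse(p, data, n_hidden) for p in pos]
    g = min(range(n_particles), key=pbest_loss.__getitem__)
    gbest, gbest_loss = pbest[g][:], pbest_loss[g]
    history = [gbest_loss]  # best loss found so far, per iteration
    for _ in range(n_iters):
        for k in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull toward personal best + pull toward global best.
                vel[k][d] = (w * vel[k][d]
                             + c1 * r1 * (pbest[k][d] - pos[k][d])
                             + c2 * r2 * (gbest[d] - pos[k][d]))
                pos[k][d] += vel[k][d]
            loss = mse(pos[k], data, n_hidden)
            if loss < pbest_loss[k]:
                pbest[k], pbest_loss[k] = pos[k][:], loss
                if loss < gbest_loss:
                    gbest, gbest_loss = pos[k][:], loss
        history.append(gbest_loss)
    return gbest, history


# Hypothetical toy data (XOR), standing in for real e-mail feature vectors.
toy_data = [((0.0, 0.0), 0.0), ((0.0, 1.0), 1.0), ((1.0, 0.0), 1.0), ((1.0, 1.0), 0.0)]

if __name__ == "__main__":
    best_weights, losses = pso_train(toy_data, n_hidden=3)
    print("initial best loss:", losses[0], "final best loss:", losses[-1])
```

Because the swarm only needs loss evaluations, no gradient is computed; the recorded best loss can never increase from one iteration to the next, although convergence to a good solution is not guaranteed and depends on the assumed parameters.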

Index Terms

Computer Science

Information Sciences

Keywords

Email; Machine Learning; Advance Fee Fraud; Fraud detection; Artificial Neural Network (ANN); Particle Swarm Optimization (PSO)