Research Article

A Hybrid Model for Classification of E-mail Fraud

by T.O. Oyegoke
International Journal of Applied Information Systems
Foundation of Computer Science (FCS), NY, USA
Volume 12 - Issue 39
Published: April 2022
10.5120/ijais2022451926

T.O. Oyegoke. A Hybrid Model for Classification of E-mail Fraud. International Journal of Applied Information Systems. 12, 39 (April 2022), 13-24. DOI=10.5120/ijais2022451926

@article{ 10.5120/ijais2022451926,
  author    = { T.O. Oyegoke },
  title     = { A Hybrid Model for Classification of E-mail Fraud },
  journal   = { International Journal of Applied Information Systems },
  year      = { 2022 },
  volume    = { 12 },
  number    = { 39 },
  pages     = { 13-24 },
  doi       = { 10.5120/ijais2022451926 },
  publisher = { Foundation of Computer Science (FCS), NY, USA }
}

%0 Journal Article
%D 2022
%A T.O. Oyegoke
%T A Hybrid Model for Classification of E-mail Fraud
%J International Journal of Applied Information Systems
%V 12
%N 39
%P 13-24
%R 10.5120/ijais2022451926
%I Foundation of Computer Science (FCS), NY, USA

Abstract

The study pre-processed e-mail data and formulated and validated a Particle Swarm Optimization (PSO)-based Back Propagation (BP) model for e-mail fraud detection. The model hybridizes two techniques: a nature-inspired algorithm (PSO) and an artificial neural network. The dataset collected for developing the model contained fraudulent (46.3%), spam (32.6%) and ham (21.1%) e-mails. After data preparation and cleaning, 12,831 features were extracted, of which 6,382 (49.7%) relevant features were selected using PSO. The model was simulated under two splits: 70% of the dataset for training with 30% for testing, and 80% for training with 20% for testing. For the gradient-based BP algorithm, using the features selected by PSO improved accuracy by 0.27% and 0.35% on the 30% and 20% testing datasets respectively; for the PSO-based BP algorithm, the improvement was 1.51% and 1.46% respectively. PSO-based BP outperformed gradient-based BP by 1.48% with the original features and 2.72% with the PSO-selected features on the 30% testing dataset, and by 1.46% and 2.57% respectively on the 20% testing dataset. The study concluded that the PSO-based BP algorithm improved the performance of the Multi-Layer Perceptron relative to the gradient-based Back Propagation algorithm, which has implications for improving advance fee fraud detection.
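
The abstract does not say how PSO and back propagation are combined. One common reading is that PSO searches the weight space of the Multi-Layer Perceptron in place of gradient descent. The sketch below illustrates that idea only and is not the author's implementation. The XOR toy data, the network size, the function names (`mlp_forward`, `mse`, `pso_train`) and the PSO constants (`w`, `c1`, `c2`) are all assumptions chosen for illustration.

```python
import math
import random


def mlp_forward(weights, x, n_in, n_hidden):
    """One hidden layer (tanh) and a single sigmoid output.

    Layout of the flat weight vector (an assumption of this sketch):
    for each hidden unit, n_in input weights followed by its bias;
    then n_hidden output weights followed by the output bias.
    """
    idx = 0
    hidden = []
    for _ in range(n_hidden):
        s = weights[idx + n_in]  # hidden-unit bias
        for i in range(n_in):
            s += weights[idx + i] * x[i]
        hidden.append(math.tanh(s))
        idx += n_in + 1
    out = weights[idx + n_hidden]  # output bias
    for j in range(n_hidden):
        out += weights[idx + j] * hidden[j]
    out = max(-60.0, min(60.0, out))  # clamp to keep exp() from overflowing
    return 1.0 / (1.0 + math.exp(-out))


def mse(weights, data, n_in, n_hidden):
    """Mean squared error of the network over (input, target) pairs."""
    total = sum((mlp_forward(weights, x, n_in, n_hidden) - y) ** 2 for x, y in data)
    return total / len(data)


def pso_train(data, n_in, n_hidden, n_particles=20, n_iters=100,
              w=0.7, c1=1.5, c2=1.5, seed=0):
    """Search the weight vector with PSO; each particle is one candidate network."""
    rng = random.Random(seed)
    dim = n_hidden * (n_in + 1) + n_hidden + 1
    pos = [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_loss = [mse(p, data, n_in, n_hidden) for p in pos]
    g = min(range(n_particles), key=pbest_loss.__getitem__)
    gbest, gbest_loss = pbest[g][:], pbest_loss[g]
    history = [gbest_loss]  # best loss found so far, per iteration
    for _ in range(n_iters):
        for k in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # inertia + pull toward personal best + pull toward global best
                vel[k][d] = (w * vel[k][d]
                             + c1 * r1 * (pbest[k][d] - pos[k][d])
                             + c2 * r2 * (gbest[d] - pos[k][d]))
                pos[k][d] += vel[k][d]
            loss = mse(pos[k], data, n_in, n_hidden)
            if loss < pbest_loss[k]:
                pbest[k], pbest_loss[k] = pos[k][:], loss
                if loss < gbest_loss:
                    gbest, gbest_loss = pos[k][:], loss
        history.append(gbest_loss)
    return gbest, history


# Toy data standing in for e-mail feature vectors (hypothetical).
XOR = [((0.0, 0.0), 0.0), ((0.0, 1.0), 1.0), ((1.0, 0.0), 1.0), ((1.0, 1.0), 0.0)]
best, history = pso_train(XOR, n_in=2, n_hidden=3, n_iters=100)
print("initial best loss:", history[0], "final best loss:", history[-1])
```

Because the global best is replaced only when a particle finds a lower loss, the recorded loss never increases from one iteration to the next. A real e-mail classifier would need three output classes (fraud, spam, ham) and a much larger input layer than this toy network.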

Index Terms
Computer Science
Information Sciences
Keywords

Email; Machine Learning; Advance Fee Fraud; Fraud Detection; Artificial Neural Network (ANN); Particle Swarm Optimization (PSO)
