Prediction of COVID-19 Severity by Applying Machine and Deep Learning Techniques
Abstract
This project predicts the severity of COVID-19 by applying machine learning and deep learning techniques to processed textual and numerical data. The dataset, augmented with a "Used Technologies" label, combines country-wise case statistics with descriptive information about COVID-19. The text data was cleaned and analyzed with Natural Language Processing (NLP) techniques using NLTK, then transformed into numerical form with TF-IDF vectorization so that text-derived features could be combined with the structured numerical data. Subsequent analysis computed the distribution of the technologies mentioned and aggregated critical, active, and new case counts for each country. The dataset was split into training and testing subsets, and a Random Forest model was trained to predict COVID-19 severity indicators from the available features. The trained model achieved an accuracy of 80%, as verified through confusion matrix evaluation. The study demonstrates the potential of combining text mining, statistical analysis, and machine learning models to predict the severity and spread of COVID-19. These insights can aid early detection, policy planning, and resource allocation during pandemic scenarios.
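
The pipeline outlined in the abstract (TF-IDF features combined with case counts, a train/test split, a Random Forest, and a confusion matrix) might be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's actual code: the records, the binary severity label, and all variable names are invented for the example, scikit-learn is assumed as the modeling library, and scikit-learn's built-in English stop-word list stands in for the NLTK cleaning step so that the block stays self-contained.

```python
import numpy as np
from scipy.sparse import csr_matrix, hstack
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split

# Made-up toy records for illustration only (NOT the study's dataset).
descriptions = [
    "contact tracing mobile app deployed nationwide",
    "thermal imaging cameras installed at airports",
    "drones used to disinfect public spaces",
    "telemedicine platform for remote consultation",
    "robots deliver medicine inside hospitals",
    "chatbot answers questions about symptoms",
    "hospitals overwhelmed and ventilator shortage reported",
    "intensive care units at full capacity",
    "oxygen supply shortage in several regions",
    "rapid rise in hospital admissions",
    "emergency field hospitals opened",
    "lockdown extended after surge in cases",
]
# Columns: critical, active, and new case counts (hypothetical values).
counts = np.array([
    [5, 200, 20], [8, 350, 35], [3, 150, 10], [6, 280, 25],
    [4, 180, 15], [7, 300, 30], [120, 9000, 900], [150, 12000, 1100],
    [90, 7000, 650], [110, 8500, 800], [130, 10000, 950], [100, 8000, 700],
])
# Hypothetical binary severity label: 0 = low, 1 = high.
severity = np.array([0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1])

# Split row indices first so that TF-IDF is fitted on training rows only.
train_idx, test_idx = train_test_split(
    np.arange(len(descriptions)),
    test_size=0.25, random_state=0, stratify=severity,
)

# Lowercasing and stop-word removal here approximate the text cleaning step.
vectorizer = TfidfVectorizer(lowercase=True, stop_words="english")
text_train = vectorizer.fit_transform([descriptions[i] for i in train_idx])
text_test = vectorizer.transform([descriptions[i] for i in test_idx])

# Append the numerical case counts to the sparse TF-IDF matrix.
X_train = hstack([text_train, csr_matrix(counts[train_idx])]).tocsr()
X_test = hstack([text_test, csr_matrix(counts[test_idx])]).tocsr()

model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, severity[train_idx])
predictions = model.predict(X_test)

# Rows of the confusion matrix are true labels, columns are predictions.
cm = confusion_matrix(severity[test_idx], predictions, labels=[0, 1])
accuracy = accuracy_score(severity[test_idx], predictions)
print(cm)
print(f"accuracy = {accuracy:.2f}")
```

Accuracy is the sum of the diagonal of the confusion matrix divided by the number of test samples, which is how a confusion matrix can be used to verify a reported accuracy figure. The score printed by this toy example says nothing about the 80% reported for the real dataset.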
