Quantum computing and machine learning for Arabic language sentiment classification in social media

Omar, Ahmed; Abd El-Hafeez, Tarek

doi:10.1038/s41598-023-44113-7

Article
Open access
Published: 12 October 2023

Scientific Reports volume 13, Article number: 17305 (2023)

Abstract

With the increasing amount of digital data generated by Arabic speakers, the need for effective and efficient document classification techniques is more important than ever. In recent years, both quantum computing and machine learning have shown great promise in the field of document classification. However, there is a lack of research investigating the performance of these techniques on the Arabic language. This paper presents a comparative study of quantum computing and machine learning for two datasets of Arabic language document classification. In the first dataset of 213,465 Arabic tweets, both classic machine learning (ML) and quantum computing approaches achieve high accuracy in sentiment analysis, with quantum computing slightly outperforming classic ML. Quantum computing completes the task in approximately 59 min, slightly faster than classic ML, which takes around 1 h. The precision, recall, and F1 score metrics indicate the effectiveness of both approaches in predicting sentiment in Arabic tweets. Classic ML achieves precision, recall, and F1 score values of 0.8215, 0.8175, and 0.8121, respectively, while quantum computing achieves values of 0.8239, 0.8199, and 0.8147, respectively. In the second dataset of 44,000 tweets, both classic ML (using the Random Forest algorithm) and quantum computing demonstrate significantly reduced processing times compared to the first dataset, with no substantial difference between them. Classic ML completes the analysis in approximately 2 min, while quantum computing takes approximately 1 min and 53 s. The accuracy of classic ML is higher at 0.9241 compared to 0.9205 for quantum computing. However, both approaches achieve high precision, recall, and F1 scores, indicating their effectiveness in accurately predicting sentiment in the dataset. 
Classic ML achieves precision, recall, and F1 score values of 0.9286, 0.9241, and 0.9249, respectively, while quantum computing achieves values of 0.92456, 0.9205, and 0.9214, respectively. The analysis of the metrics indicates that quantum computing approaches are effective in identifying positive instances and capturing relevant sentiment information in large datasets. On the other hand, traditional machine learning techniques exhibit faster processing times when dealing with smaller dataset sizes. This study provides valuable insights into the strengths and limitations of quantum computing and machine learning for Arabic document classification, emphasizing the potential of quantum computing in achieving high accuracy, particularly in scenarios where traditional machine learning techniques may encounter difficulties. These findings contribute to the development of more accurate and efficient document classification systems for Arabic data.

Introduction

The Arabic language is one of the most widely spoken languages in the world, with over 420 million speakers. With the increasing amount of digital data generated by Arabic speakers, the need for effective and efficient document classification techniques is more important than ever. Document classification is a fundamental task in natural language processing that involves assigning predefined categories to text documents based on their content. This task has numerous applications in various fields, including information retrieval, sentiment analysis, spam filtering, and news categorization¹.

In recent years, both quantum computing and machine learning have shown great promise in the field of document classification. Quantum computing is a rapidly evolving field that leverages the principles of quantum mechanics to perform computations that are intractable for classical computers. Quantum algorithms have been proposed for various machine learning tasks, including data classification, clustering, and dimensionality reduction. These algorithms offer the potential for exponential speedup over classical algorithms for certain problems, which can be particularly beneficial for large-scale data processing tasks².

Quantum computing and machine learning are two cutting-edge technologies that can be used for Arabic language sentiment classification in social media. Sentiment analysis is the process of automatically identifying and classifying the sentiment expressed in a piece of text, such as a social media post or a product review. This task is particularly challenging for the Arabic language because of its complex grammar and the variability of expressions and dialects used in social media^3,4.

Quantum computing can potentially offer a significant speedup for sentiment analysis tasks, particularly for large datasets. The quantum algorithm for sentiment analysis involves mapping the text data to a quantum state and then applying quantum operations to extract the sentiment information. However, the current state of quantum computing hardware and software is still in its early stages of development and is not yet widely available⁵.

On the other hand, machine learning has become a dominant approach for document classification in recent years. Machine learning algorithms, such as support vector machines, decision trees, and neural networks, have been widely used to classify text documents based on their content. These algorithms learn from labeled data to identify patterns and relationships in the data and use them to classify new documents^6,7.

To perform sentiment classification in Arabic language social media, a combination of both quantum computing and machine learning can be used. The quantum algorithm can be used to preprocess the data and extract the sentiment information, while the machine learning algorithm can be used to train a model that can accurately classify the sentiment in new, unlabeled data⁸.

Quantum computing is a promising technology that has the potential to revolutionize computing as we know it. Table 1 summarizes the advantages and disadvantages of quantum computing^9,10,11,12.

Table 1 Advantages and disadvantages of quantum computing.

Machine learning is a subset of artificial intelligence that involves the use of algorithms to enable machines to learn from data and make predictions or decisions without being explicitly programmed. Recent academic references have highlighted several advantages and disadvantages of machine learning^21,22,23,24,25,26,27,28. Table 2 summarizes the advantages and disadvantages of machine learning.

Table 2 Advantages and disadvantages of machine learning.

Despite the significant progress made in document classification using quantum computing and machine learning, there is a lack of research investigating the performance of these techniques on the Arabic language. This is partly due to the unique characteristics of the Arabic language, such as its rich morphology, complex syntax, and diacritic marks, which pose challenges for natural language processing techniques. Figures 1 and 2 illustrate the classical and quantum machine learning models.

Therefore, this paper aims to contribute to the development of effective and efficient document classification techniques for Arabic language data. Specifically, we present a comparative study of quantum computing and machine learning for Arabic language document classification. We compare the performance of a quantum computing-based algorithm and a machine learning-based algorithm on a dataset of Arabic language documents. We evaluate the performance of the algorithms using standard classification metrics such as accuracy, precision, recall, and F1 score.

The main contribution of this study is to provide insights into the strengths and limitations of quantum computing and machine learning for Arabic language document classification. The results of this study can inform the development of more accurate and efficient document classification systems for Arabic language data. Additionally, this study can contribute to the broader research efforts aimed at exploring the potential of quantum computing in natural language processing tasks. The detailed abbreviations and definitions used in the paper are listed in Table 3.

Table 3 List of abbreviations and acronyms used in the paper.

Problem statement

Sentiment analysis in social media has become a crucial task for understanding public opinion towards various topics. With the exponential growth of digital data generated by Arabic speakers on social media platforms, there is an urgent need to develop effective techniques for sentiment classification in Arabic language texts. While both quantum computing and machine learning have shown promise in the field of sentiment analysis, their performance on Arabic language texts in the context of social media remains largely unexplored. This paper addresses this gap by investigating the applicability and effectiveness of quantum computing and machine learning approaches for sentiment classification in Arabic language social media data. The study compares the accuracy, precision, recall, and F1 scores achieved by these approaches, as well as their computational efficiency, in order to contribute to the development of more accurate and efficient sentiment analysis systems for Arabic language texts. The findings provide insights into the strengths and limitations of quantum computing and machine learning for sentiment classification in Arabic social media data.

The main contributions of this paper can be summarized as follows:

Identification of the need for effective and efficient document classification techniques for the growing amount of digital data generated by Arabic speakers in various domains, emphasizing the significance of addressing this need in the present context.
Recognition of the potential of both quantum computing and machine learning in the field of document classification, highlighting their promise and relevance in handling large-scale data analysis tasks.
Identification of the lack of research investigating the performance of quantum computing and machine learning techniques specifically on the Arabic language, indicating a research gap to be addressed.
Presentation of a comparative study that explores and compares the performance of quantum computing and machine learning approaches for Arabic document classification using two distinct datasets.
Evaluation of the performance of classic machine learning (ML) and quantum computing approaches in sentiment analysis on two datasets of 213,465 and 44,000 Arabic tweets. The study demonstrates high accuracy achieved by both approaches, with quantum computing slightly outperforming classic ML on the larger dataset. Precision, recall, and F1 score, together with processing times, are used to evaluate the effectiveness of the approaches in sentiment prediction.
Analysis of the metrics to identify the strengths of quantum computing approaches in identifying positive instances and capturing relevant sentiment information in large datasets. Additionally, the study highlights the faster processing times exhibited by traditional machine learning techniques when dealing with smaller dataset sizes.
Provision of valuable insights into the strengths and limitations of quantum computing and machine learning for Arabic document classification, emphasizing the potential of quantum computing in achieving high accuracy, particularly in scenarios where traditional machine learning techniques may face challenges.

The remainder of this paper is organized as follows. In section "Related work", we provide an overview of related work on document classification using quantum computing and machine learning. In section "Methodology", we describe the dataset and the experimental setup. In section "Experimental results", we present the results of the comparative study and discuss the implications of the findings. Finally, in section "Discussion and future work", we conclude the paper and discuss future research directions.

Related work

Meshrif Alruily³¹ presents a comparison of previous surveys and argues for a comprehensive study on Arabic tweets. The studies on Arabic tweets are classified by approach: machine learning algorithms (supervised and unsupervised learning), hybrid methods, and lexicon-based classification. Their advantages and disadvantages are discussed. The paper also raises different challenges and future research directions in this field. Machine learning algorithms and lexicon-based classification are considered essential tools for text processing.

In their article, Ghadah Alqahtani and Abdulrahman Alothaim³² discuss the challenges of emotion analysis and classification of tweets on Twitter, a highly popular social media platform. They emphasize the difficulty of emotion classification of Arabic tweets, which requires more preprocessing than other languages. The article provides a practical overview and detailed description of materials that can aid in the development of an Arabic language model for emotion classification of Arabic tweets. The authors highlight the use of NLP for this task, present an overview of current practices and available resources, and propose future research directions by discussing some challenges and issues.

In their article, Nimish Mishra and colleagues³³ explore the merging of quantum computing and classical machine learning into a field known as quantum machine learning. The goal of quantum machine learning is to create faster learning algorithms than those currently available in classical machine learning. While classical machine learning involves identifying patterns in data for predicting future events, quantum systems generate unique patterns that cannot be produced by classical systems. This suggests that quantum computers could potentially outperform classical computers in machine learning tasks. The authors provide a review of previous research in quantum machine learning and present an update on its current status.

Chao-Han Huck Yang et al.³⁴ propose a quantum kernel learning (QKL) framework to tackle data sparsity issues encountered in training large-scale acoustic models in low-resource scenarios. They use classical-to-quantum feature encoding to project acoustic features, and apply QKL with features in the quantum space to design kernel-based classifiers. Unlike existing quantum convolution techniques, their approach utilizes QKL to improve spoken command recognition tasks for low-resource languages, such as Arabic, Georgian, Chuvash, and Lithuanian. Experimental results demonstrate that the proposed QKL-based hybrid approach outperforms existing classical and quantum solutions.

Diksha Sharma et al.³⁵ focus on the impact of hyperparameters on the performance of classical machine learning models and propose a new quantum kernel method to identify promising hyperparameters for achieving quantum advantages. They analyze and classify sentiments of textual data using a quantum kernel based on linear and fully entangled circuits, which controls the correlation among words and the expressivity of the Quantum Support Vector Machine (QSVM). The authors compare the efficiency of their proposed circuit with other quantum circuits and classical machine learning algorithms and find that their fully entangled circuit outperforms all other circuits and classical algorithms for most features. As the number of features increases, the efficiency of the proposed fully entangled model also increases significantly.

Various enhancements to the Quantum Support Vector Machine (QSVM) have been proposed in the literature. Table 4 provides a summarized comparison among recent variants of QSVMs. Notably, all the quantum versions are reported to achieve an exponential speed-up compared to classical SVM.

Table 4 Summarized comparison among variants of QSVM algorithms.

Table 5 compares quantum computing and machine learning in terms of their basic concepts, hardware requirements, data representation, speed, scalability, use cases, error correction, programming languages, accessibility, energy efficiency, learning, algorithm complexity, noise sensitivity, research, industry adoption, security, interoperability, community, funding, and future potential^40,41,42.

Table 5 Comparison of quantum computing and machine learning.

Challenges to using quantum computing and machine learning for Arabic language sentiment classification in social media include^51,52:

1. Lack of large, high-quality labeled datasets: As mentioned earlier, the lack of large, high-quality labeled datasets for sentiment analysis in Arabic language is a major challenge. This makes it difficult to train accurate machine learning models, and also limits the amount of data that can be used for quantum computing.
2. Variability in dialects and expressions: Arabic language is spoken by millions of people across different regions, each with their own dialects and expressions. This variability makes it difficult to develop a single sentiment analysis model that can accurately capture the sentiment expressed in all the different forms of Arabic used in social media.
3. Complexity of Arabic grammar: Arabic language has a complex grammar that includes features such as gender, case, and tense. This complexity can make it difficult to extract sentiment information from text, particularly for quantum computing algorithms that require a simplified representation of the data.
4. Need for specialized hardware and software: Quantum computing requires specialized hardware and software that is not yet widely available. This can make it difficult for researchers and developers to experiment with and implement quantum computing algorithms for sentiment analysis in Arabic language.
5. Integration of quantum and classical computing: Combining quantum and classical computing for sentiment analysis in Arabic language requires expertise in both areas. This can make it difficult to find researchers and developers who are skilled in both quantum computing and machine learning.
6. Interpretability of results: Quantum computing algorithms for sentiment analysis can be difficult to interpret, making it difficult to understand how the sentiment information is being extracted from the data. This can make it difficult to identify and address any biases or errors in the algorithm.
7. Limited quantum computing resources: The current state of quantum computing hardware and software is still in its early stages of development and is not yet widely available. This limits the amount of data that can be used for quantum computing and can also make it difficult to scale up the size and complexity of the sentiment analysis tasks.
8. Complexity of quantum algorithms: Quantum computing algorithms for sentiment analysis can be complex and difficult to implement. This requires specialized knowledge and expertise in quantum computing, which may not be widely available.
9. Need for specialized quantum programming languages: Quantum computing requires a different programming paradigm than classical computing and requires specialized quantum programming languages such as Qiskit or Cirq. This can make it difficult for researchers and developers who are not familiar with these languages to implement quantum algorithms for sentiment analysis.
10. Noise and errors in quantum computing: Quantum computing is susceptible to noise and errors, which can affect the accuracy and reliability of the sentiment analysis results. This requires specialized techniques such as error correction and fault tolerance, which can be difficult to implement.
11. Privacy and security concerns: Sentiment analysis in social media involves processing large amounts of personal data, which raises privacy and security concerns. Quantum computing algorithms for sentiment analysis may introduce new security risks, such as the potential for quantum attacks on encryption algorithms.
12. Ethical considerations: Sentiment analysis can have significant impacts on individuals and society, and raises ethical considerations such as fairness, transparency, and accountability. These considerations must be carefully addressed in the development and implementation of quantum computing and machine learning algorithms for sentiment analysis.
13. Scalability of machine learning algorithms: Machine learning algorithms for sentiment analysis can require significant computational resources and may not scale well to larger data sets. This requires specialized techniques such as distributed computing and parallel processing, which can be difficult to implement.
14. Bias and fairness in machine learning: Machine learning algorithms can be susceptible to bias and can produce unfair results, particularly for underrepresented groups. This requires careful attention to data collection, preprocessing, and algorithm design to ensure fairness and avoid bias in the sentiment analysis results.
15. Integration with existing systems: Sentiment analysis is often integrated with other systems such as social media platforms, customer relationship management (CRM) systems, and marketing automation systems. Integrating quantum computing and machine learning algorithms for sentiment analysis with these existing systems can be a complex and challenging task, requiring specialized knowledge and expertise.

Methodology

QSVM model

The IBM Q Account provides access to cutting-edge cloud-based IBM Q quantum systems and simulators, enabling users to develop, execute, and monitor quantum programs by establishing a reliable connection between Qiskit and quantum devices⁵³. The Qiskit framework is comprised of three key steps: building a quantum circuit to solve a given problem, executing experiments on different backends, and analyzing the results by calculating summary statistics and visualizing the outcomes.

The implementation consists of three basic steps:

1. Preprocessing, which consists of scaling, normalization, and principal component analysis
2. Generation of the kernel matrix
3. Estimation of the kernel for a new set of data points (test data) for QSVM classification
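
As a rough illustration of the second and third steps, the sketch below builds a kernel matrix over all training pairs and a kernel block for new points against the training points. It assumes a simple product-state (angle-encoding) feature map whose state overlap has a closed form and can therefore be evaluated classically. This is a stand-in chosen for illustration, not the feature map used in this study, and the function names and toy data are hypothetical.

```python
import math

def fidelity_kernel(x, z):
    """Squared overlap |<phi(x)|phi(z)>|^2 for an angle-encoded product state.

    Assumption: feature x_i is encoded on its own qubit as
    cos(x_i/2)|0> + sin(x_i/2)|1>, so the overlap factorizes per qubit
    into cos((x_i - z_i)/2).
    """
    overlap = 1.0
    for xi, zi in zip(x, z):
        overlap *= math.cos((xi - zi) / 2.0)
    return overlap ** 2

def kernel_matrix(rows, cols):
    """Kernel values for every (row, col) pair of data points."""
    return [[fidelity_kernel(r, c) for c in cols] for r in rows]

# Toy data standing in for preprocessed (scaled, PCA-reduced) features.
train = [[0.1, 0.5], [1.2, 0.3], [2.0, 1.5]]
test = [[0.2, 0.4]]

K_train = kernel_matrix(train, train)  # m x m matrix over training pairs
K_test = kernel_matrix(test, train)    # one row per new data point
```

On a quantum backend each entry would instead be estimated from repeated circuit measurements, so the values would carry sampling noise.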

In the QSVM classification phase, a classical SVM, rather than a quantum circuit, is used to generate the separating hyperplane, and the quantum computer is used twice. First, the kernel is estimated for all pairs of training data; second, the kernel is estimated for a new datum (test data). The least-squares reformulation of the support vector machine is used to turn the quadratic programming problem of the SVM into the problem of solving a system of linear equations⁵⁴:

$$F\begin{pmatrix} b \\ \vec{\alpha} \end{pmatrix} \equiv \begin{pmatrix} 0 & \vec{1}^{T} \\ \vec{1} & K+\gamma^{-1}I \end{pmatrix}\begin{pmatrix} b \\ \vec{\alpha} \end{pmatrix} = \begin{pmatrix} 0 \\ \vec{y} \end{pmatrix}$$

(1)

where K is the m × m kernel matrix, whose elements are given by

$$K_{jk}=K\left(x_{j},x_{k}\right)=\phi\left(x_{j}\right)\cdot \phi\left(x_{k}\right)$$

(2)

γ is a user-defined value that controls the trade-off between training error and the SVM objective, and y is a vector storing the labels of the training data, so the only unknowns in the equation are b and the vector α.
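
To make the linear system of Eq. (1) concrete, the following sketch assembles the block matrix from a small kernel matrix and solves it for b and α. The kernel values, labels, and γ are toy numbers chosen for illustration, and the helper names are hypothetical. A plain Gaussian elimination keeps the example dependency-free; in practice a library routine such as `numpy.linalg.solve` would normally be used.

```python
def solve_linear_system(A, rhs):
    """Solve A x = rhs by Gaussian elimination with partial pivoting."""
    n = len(A)
    # Augmented matrix [A | rhs], copied so the inputs are not modified.
    M = [list(map(float, A[i])) + [float(rhs[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def train_ls_svm(K, y, gamma):
    """Build the block matrix of Eq. (1) and solve for (b, alpha)."""
    m = len(K)
    F = [[0.0] + [1.0] * m]  # first row: (0, 1^T)
    for j in range(m):
        # Remaining rows: (1, K + I / gamma)
        F.append([1.0] + [K[j][k] + (1.0 / gamma if j == k else 0.0)
                          for k in range(m)])
    solution = solve_linear_system(F, [0.0] + list(y))
    return solution[0], solution[1:]

# Toy 3-point example; K would come from the quantum kernel estimation.
K = [[1.0, 0.2, 0.1],
     [0.2, 1.0, 0.3],
     [0.1, 0.3, 1.0]]
y = [1.0, -1.0, 1.0]
b, alpha = train_ls_svm(K, y, gamma=10.0)
```

The first row of the system forces the entries of α to sum to zero, which is a quick sanity check on any solution.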

After calculating the kernel matrix on the quantum computer, we can train the Quantum SVM the same way as a classical SVM. Once the parameters of the hyperplane are determined, a new data point x_0 can be classified as

$$y\left(x_{0}\right)=\operatorname{sgn}\left(\sum_{i=1}^{m}\alpha_{i}K\left(x_{i},x_{0}\right)+b\right)$$

(3)

where x_i, with i = 1,…,m, are the training data, α_i is the i-th component of the parameter vector α, and

$$\operatorname{sgn}\left(x\right)=\begin{cases}1, & \text{if } x\ge 0\\ -1, & \text{if } x<0\end{cases}$$

(4)
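
The decision rule of Eqs. (3) and (4) is a weighted sum of kernel values followed by a sign. A minimal sketch, with hypothetical parameter values in place of a trained model:

```python
def sgn(value):
    """Sign function of Eq. (4): +1 for value >= 0, otherwise -1."""
    return 1 if value >= 0 else -1

def classify(alpha, b, kernel_row):
    """Eq. (3): kernel_row[i] holds K(x_i, x_0) for each training point x_i."""
    score = sum(a * k for a, k in zip(alpha, kernel_row)) + b
    return sgn(score)

# Hypothetical trained parameters and kernel values for one new point x_0.
alpha = [0.8, -1.1, 0.3]
b = 0.05
kernel_row = [0.9, 0.2, 0.4]
label = classify(alpha, b, kernel_row)
```

Only the kernel values between the new point and the training points are needed at prediction time, which is why the kernel is estimated a second time for test data.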

A few important parameters that are specific to the quantum algorithms are⁵³:

feature_dimension: number of features,
depth: number of times the circuit is repeated,
entangler_map: describes the connectivity of qubits as [source, target] pairs,
entanglement: generates the qubit connectivity; 'full' entangles each qubit with all subsequent ones and 'linear' entangles each qubit with the next,
feature_map (FeatureMap): feature map module that transforms the data to the feature space,
datapoints: prediction dataset,
quantum_instance (QuantumInstance): quantum backend with all execution settings,
shots: number of repetitions of each circuit,
seed_simulator: random seed for simulators,
seed_transpiler: random seed for the circuit mapper,
QSVM: Quantum SVM method that runs the classification algorithm (binary or multiclass).
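
The difference between the 'full' and 'linear' entanglement settings can be seen by listing the [source, target] qubit pairs each one implies. The helper below is a hypothetical illustration written for this purpose and is not part of Qiskit:

```python
def entangler_map(num_qubits, entanglement):
    """Return [source, target] qubit pairs for a given entanglement pattern."""
    if entanglement == "full":
        # Each qubit is entangled with all subsequent qubits.
        return [[s, t] for s in range(num_qubits)
                for t in range(s + 1, num_qubits)]
    if entanglement == "linear":
        # Each qubit is entangled only with the next one.
        return [[s, s + 1] for s in range(num_qubits - 1)]
    raise ValueError(f"unknown entanglement pattern: {entanglement!r}")

full_pairs = entangler_map(4, "full")
linear_pairs = entangler_map(4, "linear")
```

For n qubits the full pattern yields n(n-1)/2 pairs and the linear pattern n-1, so the full pattern produces deeper circuits as the feature dimension grows.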

Table 6 lists the hyperparameters of the Qiskit Aqua machine learning library's QSVM (Quantum Support Vector Machine) model for text classification, including their descriptions and the values used.

Table 6 Hyperparameters control various aspects of the QSVM model for text classification.

Selecting hyperparameters for the QSVM model in text classification is an iterative process that combines domain knowledge, experimentation, and fine-tuning. Finding a well-performing configuration usually requires multiple iterations: different hyperparameter values are tested and the choices are refined based on the observed results, with the aim of achieving the best possible performance for the specific text classification problem at hand.

Random Forest model

Random Forest is a popular machine learning algorithm used for classification tasks, including sentiment classification in social media. The main idea behind the Random Forest algorithm is to build many decision trees on different random subsets of the training data, and then combine their predictions to make a final prediction.

The Random Forest algorithm can be broken down into the following steps:

1. Data preparation: The social media data is preprocessed to remove noise, stop words, and other unwanted elements. Then, the text data is transformed into a numerical representation, such as a bag-of-words (BoW) or term frequency-inverse document frequency (TF-IDF) matrix.
2. Building decision trees: Random Forest builds many decision trees on different random subsets of the training data. Each decision tree is built using a subset of features and a subset of training examples. The goal is an ensemble with low bias and low variance: individual trees tend to have low bias but high variance, and combining many decorrelated trees reduces the variance.
3. Splitting criteria: At each node of a decision tree, the algorithm selects the best feature and threshold to split the data. The most common splitting criteria are entropy and Gini impurity.
4. Combining predictions: Once all the decision trees are built, the algorithm combines their predictions to make a final prediction. The most common method for combining the predictions is majority voting, where the class with the most votes is selected as the final prediction.
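
The sampling and voting steps above can be sketched in a few lines. The example assumes bootstrap sampling (drawing with replacement) as the way random subsets are formed, which is the usual choice for Random Forest but is not spelled out in the text. The data and votes are placeholders.

```python
import random
from collections import Counter

def bootstrap_sample(examples, rng):
    """Draw len(examples) items with replacement (one random subset per tree)."""
    return [rng.choice(examples) for _ in examples]

def majority_vote(predictions):
    """Return the class predicted by the most trees."""
    return Counter(predictions).most_common(1)[0][0]

rng = random.Random(0)
examples = [("tweet_a", "positive"), ("tweet_b", "negative"), ("tweet_c", "positive")]
subsets = [bootstrap_sample(examples, rng) for _ in range(5)]  # one per tree

# Hypothetical votes from five trees for a single tweet.
votes = ["positive", "negative", "positive", "positive", "negative"]
final_label = majority_vote(votes)
```

Each tree would be trained on its own subset; only the voting step is needed at prediction time.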

The equations for Random Forest are primarily related to the splitting criteria used to create decision trees. The entropy equation is given by⁵⁵:

$$H(S) = -\sum_{i} p(i)\,\log_{2} p(i)$$

(5)

where H(S) is the entropy of a set S, and p(i) is the proportion of examples in S that belong to class i. Entropy measures the impurity of a set, with lower values indicating more purity.

The Gini impurity equation is given by:

$$G(S) = 1 - \sum_{i} p(i)^{2}$$

(6)

where G(S) is the Gini impurity of a set S, and p(i) is the proportion of examples in S that belong to class i. Gini impurity is another measure of impurity, with lower values indicating more purity.
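
Both impurity measures in Eqs. (5) and (6) can be computed directly from the class labels in a node. A minimal sketch with toy labels:

```python
import math
from collections import Counter

def class_proportions(labels):
    """Proportion p(i) of each class present in the label list."""
    counts = Counter(labels)
    total = len(labels)
    return [count / total for count in counts.values()]

def entropy(labels):
    """Eq. (5): H(S) = -sum_i p(i) * log2(p(i))."""
    return -sum(p * math.log2(p) for p in class_proportions(labels))

def gini(labels):
    """Eq. (6): G(S) = 1 - sum_i p(i)^2."""
    return 1.0 - sum(p * p for p in class_proportions(labels))

mixed = ["pos", "pos", "neg", "neg"]  # evenly split node
pure = ["pos", "pos", "pos", "pos"]   # single-class node
```

For the evenly split two-class node, entropy is 1 and Gini impurity is 0.5, the maximum values for two classes. For the single-class node both are 0.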

The Random Forest algorithm is a powerful and flexible technique for sentiment classification in social media, as it can handle large amounts of data and complex feature interactions. In our study, we selected the RF algorithm as the ensemble technique for the prediction and classification tasks related to our research objective. This choice was based on its well-established reputation for handling complex datasets and high-dimensional features and for providing robust performance in various domains. We expected RF to be well suited to our problem because of its ability to handle non-linear relationships and capture important feature interactions. The remainder of this methodology outlines the steps involved in comparing the performance of classic and quantum machine learning algorithms on sentiment analysis of Arabic tweets.

Although RF has been widely utilized in previous studies, its application and evaluation in the specific domain of our research problem is valuable. By including RF as one of the ensemble techniques in our evaluation, we aimed to compare its performance with other algorithms and assess its suitability for our specific dataset and research objective. Therefore, while the RF algorithm itself may not be new, its application and evaluation in our study contribute to the understanding of its effectiveness in addressing our research problem and provide insights into its performance in this specific context.

The hyperparameters for Random Forest in Arabic text classification, along with their values, are shown in Table 7.

Table 7 Hyperparameters for random forest in Arabic text classification.

The proposed method description

Datasets description:

1.
Data Set: We used two Arabic sentiment datasets of different sizes to investigate the effect of the dataset size.
- The First Dataset: The Arabic dataset for sentiment analysis about Asthma is a collection of 213,465 tweets that have been gathered from Twitter. This dataset has been specifically curated for sentiment analysis, with a focus on the topic of Asthma. The tweets were collected from users who tweeted about Asthma in the Arabic language, and cover a variety of opinions and sentiments related to the topic⁵⁶ and the dataset is available at https://www.kaggle.com/datasets/mtesta010/arabic-asthma-tweets.
- The second dataset: Arabic dataset for sentiment analysis about different topics contain 44,000 posts and tweets collected from Facebook and twitter. The tweets and posts were collected from most visited and fastest growing Facebook pages and Twitter accounts⁵⁷.
2.
Data Preparation: The next step is to preprocess the dataset by removing irrelevant information and cleaning the text. This involves removing any noise, punctuation, and stop words, as well as normalizing the text to ensure consistency.
3.
Feature Representation: The next step is to represent the preprocessed text data in a numerical format that can be used by the machine learning algorithm. The method used in this study is TF-IDF, which represents each tweet as a vector of numerical features. Each feature represents the frequency of a particular word in the tweet, weighted by its inverse document frequency.
4.
Classic Machine Learning: The next step is to apply a classic machine learning algorithm to the preprocessed dataset with the TF-IDF feature representation. In this study, the Random Forest algorithm is used as it is a well-established and popular classification algorithm. Random Forest works by building an ensemble of decision trees and using them to make predictions.
5.
Quantum Machine Learning: The next step is to apply quantum computing (QC) to the same classifier and feature representation to show the effect of QC on classification time and performance. This involves using a quantum algorithm to perform the classification, which has the potential to lead to faster classification times and improved performance compared to the classic machine learning algorithm.
6.
Compute the Performance of the Models: The performance of the classic and quantum machine learning algorithms is compared to determine which approach is more effective for sentiment analysis on the Arabic tweet dataset. Metrics such as accuracy, precision, recall, and F1 score are used to evaluate the performance of the models.
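Steps 2 and 3 above can be sketched in plain Python. This is a minimal illustration, not the authors' pipeline: the stop-word list, the normalization rules (stripping diacritics and tatweel, unifying alef variants), the example tweets, and the function names `preprocess` and `tfidf` are all assumptions, since the paper does not list its exact cleaning rules. The weighting is the textbook form tf × ln(N/df); libraries such as scikit-learn apply smoothed variants by default.

```python
import math
import re
from collections import Counter

# Assumed, illustrative stop-word list; the paper does not publish its own.
STOP_WORDS = {"في", "من", "على", "عن", "و"}

# Arabic diacritics (tashkeel, U+064B..U+0652) and tatweel (U+0640).
DIACRITICS = re.compile(r"[\u064B-\u0652\u0640]")
# Anything that is not an Arabic letter or whitespace (punctuation, digits, Latin).
NON_ARABIC = re.compile(r"[^\u0621-\u064A\s]")


def preprocess(text):
    """Clean one tweet and return its list of tokens."""
    text = DIACRITICS.sub("", text)        # strip diacritics first so words stay intact
    text = re.sub("[إأآ]", "ا", text)       # unify alef variants (assumed normalization)
    text = NON_ARABIC.sub(" ", text)       # replace noise and punctuation with spaces
    return [tok for tok in text.split() if tok not in STOP_WORDS]


def tfidf(docs):
    """Map each tokenized document to a {term: tf * ln(N / df)} dictionary."""
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


# Hypothetical example tweets, not taken from either dataset.
tweets = ["الربو مرض مزمن!!", "علاج الربو في المستشفى", "الطقس جميل"]
tokens = [preprocess(t) for t in tweets]
vectors = tfidf(tokens)
```

Under this formula a term that occurs in every tweet gets weight zero, so rarer and more discriminative words dominate each vector.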

Random Forest is a good choice for comparing classic and quantum machine learning algorithms on sentiment analysis for several reasons. Firstly, it is a well-established and popular classification algorithm that has been shown to work well on a variety of datasets, including text data like tweets. Secondly, it is relatively simple and easy to implement, making it a good choice for comparing the performance of classic and quantum machine learning approaches. Thirdly, it is an ensemble method that can handle high-dimensional data like the TF-IDF feature representation used in this study. This makes it a good choice for comparing the performance of classic and quantum machine learning algorithms on a high-dimensional dataset like the Arabic tweet dataset. Finally, Random Forest has been shown to be robust to noise and outliers, which is important when working with real-world data that may contain noise or errors.
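The ensemble idea behind Random Forest (bootstrap samples, random feature subsets, majority vote) can be illustrated in a few lines. The sketch below is a deliberately simplified stand-in: it uses one-split decision stumps instead of full decision trees, and the feature values, labels, and function names are hypothetical. It does not reproduce the configuration in Table 7, and a real experiment would rely on an established library implementation.

```python
import random
from collections import Counter


def fit_stump(X, y, feature_ids):
    """Find the single (feature, threshold) split with the fewest training errors."""
    best = None
    for f in feature_ids:
        for threshold in sorted({row[f] for row in X}):
            left = [label for row, label in zip(X, y) if row[f] <= threshold]
            right = [label for row, label in zip(X, y) if row[f] > threshold]
            left_label = Counter(left).most_common(1)[0][0]
            right_label = Counter(right).most_common(1)[0][0] if right else left_label
            errors = (sum(label != left_label for label in left)
                      + sum(label != right_label for label in right))
            if best is None or errors < best[0]:
                best = (errors, f, threshold, left_label, right_label)
    return best[1:]


def fit_forest(X, y, n_trees=25, seed=0):
    """Train stumps on bootstrap samples, each restricted to a random feature subset."""
    rng = random.Random(seed)
    n_features = len(X[0])
    k = max(1, int(n_features ** 0.5))   # common heuristic: sqrt of the feature count
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in X]     # sample rows with replacement
        feats = rng.sample(range(n_features), k)     # random feature subset
        forest.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], feats))
    return forest


def predict(forest, row):
    """Majority vote over all stumps."""
    votes = Counter(left if row[f] <= t else right for f, t, left, right in forest)
    return votes.most_common(1)[0][0]


# Hypothetical 4-dimensional feature vectors (for example, TF-IDF weights of four terms).
X = [[0.9, 0.8, 0.0, 0.1], [0.8, 0.7, 0.1, 0.0], [0.7, 0.9, 0.0, 0.2], [0.6, 0.6, 0.2, 0.1],
     [0.1, 0.0, 0.9, 0.7], [0.0, 0.2, 0.8, 0.9], [0.2, 0.1, 0.7, 0.8], [0.1, 0.1, 0.6, 0.6]]
y = ["pos"] * 4 + ["neg"] * 4
forest = fit_forest(X, y)
```

Each stump sees a different resampling of the data and a different subset of features, so individual errors tend to be outvoted; this is the property that makes the method tolerant of noisy, high-dimensional inputs.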

The classification steps are summarized in Fig. 3.

The steps (pseudocode) for quantum computing in text classification task are:

Performance evaluation

The performance of the proposed method can be measured using well-known evaluation metrics—the accuracy of the classification, precision, recall, and F1 scores. These metrics are based on a “confusion matrix” that includes true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN)^51,58.

Accuracy is a measure of the overall correctness of a classification model. It calculates the ratio of correctly predicted instances to the total number of instances in the dataset.

$$\mathrm{Accuracy}=\frac{\mathrm{TP }+\mathrm{ TN}}{\mathrm{TP }+\mathrm{ FP }+\mathrm{ TN }+\mathrm{ FN}}$$

(7)

Precision is a measure of the accuracy of positive predictions made by a classification model. It calculates the ratio of correctly predicted positive instances to the total number of instances predicted as positive.

$$\mathrm{Precision}=\frac{\mathrm{TP }}{\mathrm{TP }+\mathrm{ FP}}$$

(8)

Recall is a measure of the model's ability to correctly identify all relevant instances in the dataset. It calculates the ratio of correctly predicted positive instances to the total number of actual positive instances.

$$\mathrm{Recall}=\frac{\mathrm{TP}}{\mathrm{TP }+\mathrm{ FN}}$$

(9)

The F1 score is the harmonic mean of precision and recall. It provides a balanced measure of a model's performance by considering both false positives and false negatives.

$$\mathrm{F1\text{-}score}=2\times \frac{\mathrm{Precision}\times \mathrm{Recall}}{\mathrm{Precision}+\mathrm{Recall}}$$

(10)
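Equations (7) to (10) translate directly into code. The function below is a small sketch for the binary case; the name `classification_metrics` and the zero-division guards are assumptions added for illustration, and the counts in the usage lines are made up rather than taken from the experiments.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute accuracy, precision, recall, and F1 from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)                  # Eq. (7)
    precision = tp / (tp + fp) if (tp + fp) else 0.0            # Eq. (8)
    recall = tp / (tp + fn) if (tp + fn) else 0.0               # Eq. (9)
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)      # Eq. (10)
    else:
        f1 = 0.0                                                # guard: no positives found
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts for 100 test tweets.
scores = classification_metrics(tp=40, tn=45, fp=5, fn=10)
```

With these counts, accuracy is 0.85, precision is 40/45, recall is 0.80, and the F1 score is 80/95, which sits between precision and recall but closer to the lower of the two.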

Ethical statement

This study involved only secondary data analysis of publicly available data and did not involve any human subjects or animals. As such, it was exempt from ethical approval under the guidelines of the Minia University Ethics Committee. All data used in this study were publicly available and did not contain any identifiable information about individuals. The study was conducted in compliance with all relevant regulations and guidelines.

Informed consent

Informed consent was obtained from all individual participants included in the study.

Experimental results

Our model was implemented and tested on Kaggle's cloud-based platform.

Comparison of classic and quantum machine learning techniques for sentiment analysis on the first Arabic tweet dataset

Table 8 presents a comparison of the performance of classic machine learning (ML) and quantum computing approaches for sentiment analysis on the first dataset. The evaluation metrics used are time, accuracy, precision, recall, and F1 score.

Table 8 Comparison of classic and quantum machine learning for the first dataset.


Figure 4 shows time consumed comparison and Fig. 5 shows performance comparison for the first dataset.

The results show that both classic ML and quantum computing approaches achieve high accuracy in sentiment analysis on the first dataset. The classic ML approach takes 3637.96 s (approximately 1 h) to complete the sentiment analysis task, while the quantum computing approach takes 3577.54 s (approximately 59 min), a saving of about 60 s, or roughly 1.7%. This suggests that quantum computing may lead to faster classification times, although the margin observed here is small.

In terms of accuracy, the quantum computing approach achieves a slightly higher accuracy of 0.8199 compared to the classic ML approach, which achieves an accuracy of 0.8175. However, the difference in accuracy is relatively small and may not be significant in practice.

The precision, recall, and F1 score metrics show that both classic ML and quantum computing approaches perform similarly in sentiment analysis on the Arabic tweet dataset. The precision metric measures the proportion of true positive predictions out of all positive predictions. The recall metric measures the proportion of true positive predictions out of all actual positive instances. The F1 score is the harmonic mean of precision and recall, and it provides a balanced measure of both metrics.

The precision, recall, and F1 score values for the classic ML approach are 0.8215, 0.8175, and 0.8121, respectively. The precision, recall, and F1 score values for the quantum computing approach are 0.8239, 0.8199, and 0.8147, respectively. These values show that both classic ML and quantum computing approaches achieve high precision, recall, and F1 scores, indicating that they are effective at predicting sentiment in Arabic tweets.
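The source does not state how per-class scores were aggregated into a single precision, recall, and F1 value. One observation is that the reported recall equals the reported accuracy for both approaches (0.8175 and 0.8199), which is exactly what support-weighted averaging produces. The sketch below illustrates that property on made-up labels; the averaging scheme is an inference rather than a documented detail, and the function name and data are hypothetical.

```python
from collections import Counter


def weighted_scores(y_true, y_pred):
    """Return accuracy and support-weighted precision, recall, and F1."""
    n = len(y_true)
    support = Counter(y_true)              # number of true instances per class
    pairs = list(zip(y_true, y_pred))
    precision = recall = f1 = 0.0
    for label, count in support.items():
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        p_c = tp / (tp + fp) if tp + fp else 0.0
        r_c = tp / count
        f_c = 2 * p_c * r_c / (p_c + r_c) if p_c + r_c else 0.0
        weight = count / n                 # class share of the test set
        precision += weight * p_c
        recall += weight * r_c
        f1 += weight * f_c
    accuracy = sum(t == p for t, p in pairs) / n
    return accuracy, precision, recall, f1


# Hypothetical three-class labels, not drawn from the datasets.
y_true = ["pos"] * 4 + ["neg"] * 4 + ["neu"] * 2
y_pred = ["pos", "pos", "pos", "neg", "neg", "neg", "neg", "pos", "neu", "pos"]
accuracy, precision, recall, f1 = weighted_scores(y_true, y_pred)
```

Under this scheme the averaged F1 is a weighted mean of per-class F1 scores, so it need not equal the harmonic mean of the averaged precision and recall. That would be consistent with the classic ML figures above, where the harmonic mean of 0.8215 and 0.8175 is about 0.8195 while the reported F1 score is 0.8121.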

The results in Table 9 suggest that quantum computing has the potential to improve the speed of sentiment analysis on Arabic tweets. However, the difference in accuracy between classic ML and quantum computing approaches is relatively small, and both approaches perform similarly in terms of precision, recall, and F1 score. Further research is needed to explore the potential benefits of quantum computing in sentiment analysis on other datasets and to investigate the scalability of quantum machine learning algorithms for larger datasets.

Table 9 Comparison of classic and quantum machine learning for the second dataset.


Figure 6 shows time consumed comparison and Fig. 7 shows performance comparison for the second dataset.

In the case of the second dataset with 44,000 tweets, both classic ML (Random Forest) and quantum computing demonstrate significantly reduced processing times compared to the first dataset. Classic ML takes only 114.9 s (approximately 2 min) to complete, while quantum computing takes 112.7 s (approximately 1 min and 53 s). These results suggest that both techniques are highly efficient when dealing with a smaller dataset, and the difference in processing time between them is minimal.

Accuracy: Classic ML (RF) achieves a higher accuracy of 0.9241, while quantum computing achieves an accuracy of 0.9205. Classic ML maintains a slightly better accuracy in this case, but the difference remains small, indicating that both techniques perform well in accurately predicting sentiment for this dataset.

Precision, Recall, and F1 Score: Both classic ML and quantum computing approaches maintain high precision, recall, and F1 scores. The precision, recall, and F1 score values for classic ML are 0.9286, 0.9241, and 0.9249, respectively. For quantum computing, the values are 0.92456, 0.9205, and 0.9214, respectively. These metrics suggest that both approaches are effective at identifying positive instances and capturing relevant sentiment information in the dataset.
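The size of the reported differences is easier to judge as simple arithmetic. The snippet below uses only the times and accuracies quoted in the text for the two datasets; the helper name `compare` and the dictionary layout are assumptions made for illustration.

```python
def compare(classic, quantum):
    """Summarize how the quantum run differs from the classic run."""
    saved = classic["time_s"] - quantum["time_s"]
    return {
        "time_saved_s": saved,
        "time_saved_pct": 100 * saved / classic["time_s"],
        "accuracy_gain": quantum["accuracy"] - classic["accuracy"],
    }


# Values as reported in the text for the first and second datasets.
first = compare({"time_s": 3637.96, "accuracy": 0.8175},
                {"time_s": 3577.54, "accuracy": 0.8199})
second = compare({"time_s": 114.9, "accuracy": 0.9241},
                 {"time_s": 112.7, "accuracy": 0.9205})
```

For the first dataset this gives a saving of about 60.42 s (roughly 1.66%) and an accuracy gain of 0.0024 for the quantum approach. For the second dataset the saving is about 2.2 s (roughly 1.91%), while accuracy is 0.0036 lower than for classic ML.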

Limitations

There are certain limitations to adopting quantum computing and machine learning for Arabic language sentiment classification in social media^51,52:

1.
Limited availability of quantum computing hardware: The current state of quantum computing hardware is still in its early stages of development and is not yet widely available. This can limit the amount of data that can be used for quantum computing, and can also make it difficult to scale up the size and complexity of the sentiment analysis tasks.
2.
Limited availability of quantum computing software: Quantum computing requires specialized software that is not yet widely available. This can make it difficult for researchers and developers to experiment with and implement quantum computing algorithms for sentiment analysis in Arabic language.
3.
Complexity of quantum algorithms: Quantum computing algorithms for sentiment analysis can be complex and difficult to implement. This requires specialized knowledge and expertise in quantum computing, which may not be widely available.
4.
Need for specialized quantum programming languages: Quantum computing requires a different programming paradigm than classical computing, and requires specialized quantum programming languages such as Qiskit or Cirq. This can make it difficult for researchers and developers who are not familiar with these languages to implement quantum algorithms for sentiment analysis.
5.
Variability in dialects and expressions: Arabic language is spoken by millions of people across different regions, each with their own dialects and expressions. This variability makes it difficult to develop a single sentiment analysis model that can accurately capture the sentiment expressed in all the different forms of Arabic used in social media.
6.
Complexity of Arabic grammar: Arabic language has a complex grammar that includes features such as gender, case, and tense. This complexity can make it difficult to extract sentiment information from text, particularly for quantum computing algorithms that require a simplified representation of the data.
7.
Noise and errors in quantum computing: Quantum computing is susceptible to noise and errors, which can affect the accuracy and reliability of the sentiment analysis results. This requires specialized techniques such as error correction and fault tolerance, which can be difficult to implement.
8.
Integration with existing systems: Sentiment analysis is often integrated with other systems such as social media platforms, customer relationship management (CRM) systems, and marketing automation systems. Integrating quantum computing and machine learning algorithms for sentiment analysis with these existing systems can be a complex and challenging task, requiring specialized knowledge and expertise.
9.
The study is limited to two datasets of Arabic tweets, so the results may not generalize to other types or sizes of Arabic-language datasets.
10.
Only the sentiment analysis task was considered; performance could differ for other document classification problems.
11.
The quantum computing technique used was a generic approach; tailored quantum algorithms may improve the results.
12.
Processing times may vary with different hardware and system specifications.

Discussion and future work

Sentiment analysis in Arabic language social media is a challenging task due to the complexity of the Arabic language, the variability in dialects and expressions, and the lack of large, high-quality labeled datasets. Quantum computing and machine learning offer promising solutions for sentiment analysis in Arabic language social media, but there are several challenges that must be carefully considered and addressed.

The exponential growth of digital data generated by Arabic speakers has created a pressing need for effective and efficient document classification techniques. While both quantum computing and machine learning have shown promise in this field, there is a noticeable lack of research exploring their performance specifically on the Arabic language. This paper aims to address this gap by conducting a comparative study of quantum computing and machine learning techniques for Arabic document classification, utilizing two distinct datasets.

The first dataset comprises 213,465 Arabic tweets, and it serves as the basis for sentiment analysis. Both classical machine learning (ML) and quantum computing approaches demonstrate high accuracy in sentiment prediction. Quantum computing slightly outperforms classical ML, achieving an accuracy of approximately 81.99%, while classical ML achieves an accuracy of 81.75%. The computational time for quantum computing is approximately 59 min, slightly faster than the one-hour processing time of classical ML. Precision, recall, and F1 score metrics further validate the effectiveness of both approaches in predicting sentiment in Arabic tweets. Classical ML exhibits precision, recall, and F1 score values of 0.8215, 0.8175, and 0.8121, respectively, while quantum computing achieves values of 0.8239, 0.8199, and 0.8147, respectively.

Moving on to the second dataset, which consists of 44,000 tweets, both classical ML (specifically, the Random Forest algorithm) and quantum computing demonstrate significantly reduced processing times compared to the first dataset. However, there is no substantial difference in processing time between the two approaches. Classical ML completes the analysis in approximately 2 min, whereas quantum computing takes around 1 min and 53 s. In terms of accuracy, classical ML achieves a slightly higher rate of 92.41%, compared to 92.05% for quantum computing. Nevertheless, both approaches achieve high precision, recall, and F1 scores, indicating their effectiveness in accurately predicting sentiment in the dataset. Classical ML attains precision, recall, and F1 score values of 0.9286, 0.9241, and 0.9249, respectively, while quantum computing achieves values of 0.92456, 0.9205, and 0.9214, respectively.

The analysis of these metrics indicates that the quantum computing approach was slightly more accurate and slightly faster on the larger first dataset, whereas classical machine learning was slightly more accurate on the smaller second dataset, where processing times were nearly identical (114.9 s versus 112.7 s). In both cases the margins are small. These findings shed light on the strengths and limitations of quantum computing and machine learning for Arabic document classification. They highlight the potential of quantum computing to achieve accuracy comparable to that of traditional machine learning techniques.

This study contributes valuable insights to the development of more accurate and efficient document classification systems for Arabic data. By showcasing the advantages and trade-offs of both quantum computing and classical machine learning, it lays the groundwork for future research and encourages the exploration of quantum computing techniques in Arabic text analysis. Ultimately, the findings of this study have the potential to enhance the accuracy and efficiency of document classification systems for Arabic speakers, thus addressing the growing need for effective language processing tools in the digital era.

The following are potential areas for future work in quantum computing and machine learning for Arabic language sentiment classification in social media:

1.
Developing larger, high-quality labeled datasets for sentiment analysis in Arabic language, which can be used to train and evaluate machine learning models and to test quantum computing algorithms.
2.
Developing specialized quantum computing algorithms for sentiment analysis in Arabic language that can handle the complexity of Arabic grammar, the variability in dialects and expressions, and the noise and errors that are inherent in quantum computing.
3.
Developing more efficient quantum computing algorithms for sentiment analysis that can handle larger data sets, and that can be implemented on current and future quantum computing hardware.
4.
Integrating quantum computing algorithms with existing machine learning models and systems for sentiment analysis, to leverage the strengths of both approaches.
5.
Developing new techniques for error correction and fault tolerance in quantum computing algorithms for sentiment analysis, to improve the accuracy and reliability of the results.
6.
Developing new techniques for data preprocessing and feature selection in machine learning algorithms for sentiment analysis, to improve the accuracy and efficiency of the models.
7.
Developing new techniques for handling bias and fairness in machine learning algorithms for sentiment analysis, to ensure that the results are fair and unbiased for all groups.
8.
Developing new techniques for handling privacy and security concerns in sentiment analysis, particularly with respect to the processing of personal data.
9.
Developing new techniques specifically tailored to address the challenges associated with imbalanced data.
10.
Developing new techniques for handling variability in dialects and expressions in Arabic language sentiment analysis, to improve the accuracy and relevance of the results.
11.
Developing new techniques for handling the complexity of Arabic grammar in sentiment analysis, to improve the accuracy and relevance of the results.
12.
Developing new techniques for sentiment analysis across multiple languages, to enable cross-lingual sentiment analysis and to improve the accuracy and relevance of the results.
13.
Developing new techniques for real-time sentiment analysis in social media, to enable real-time monitoring and response to changes in sentiment.
14.
Developing new techniques for sentiment analysis in multimedia content, such as images and videos, to enable analysis of sentiment in non-textual content.
15.
Developing new techniques for sentiment analysis in specific domains, such as politics, sports, or entertainment, to enable more targeted and relevant analysis of sentiment.
16.
Developing new techniques for sentiment analysis in specific social media platforms, such as Twitter, Facebook, or Instagram, to enable more targeted and relevant analysis of sentiment.

17.
Developing new techniques for sentiment analysis that take into account the context and cultural background of the users, to improve the accuracy and relevance of the results.

Conclusion

Using quantum computing and machine learning offers promising solutions for Arabic language sentiment classification in social media. However, several challenges must be carefully considered and addressed when implementing these technologies, such as the lack of large, high-quality labeled datasets, the complexity of Arabic grammar, the variability in dialects and expressions, and the need for specialized hardware and software for quantum computing. Both machine learning and quantum computing show promise for Arabic document classification, but research in this area is limited. This study comparatively evaluated the two approaches on sentiment analysis of Arabic tweets. For the larger dataset of about 213K tweets, both techniques achieved high accuracy, with quantum computing performing slightly better and running slightly faster. Precision, recall, and F1 scores indicated that both approaches predicted sentiment effectively. On the smaller dataset of 44K tweets, processing times were significantly lower for both approaches, with no substantial difference between them. Classic machine learning achieved slightly higher accuracy, and the remaining metrics were similar for the two approaches. The findings provide insights into the strengths and limitations of each technique and show that quantum computing can reach accuracy comparable to traditional machine learning on these datasets. This research contributes to more accurate and efficient systems for classifying Arabic data.

Data availability

The dataset used in this study is public and all test data are available at this portal (https://www.kaggle.com/datasets/mtesta010/arabic-asthma-tweets).

References

Muaad, A. Y., et al. Arabic document classification: Performance investigation of preprocessing and representation techniques. Math. Probl. Eng. https://doi.org/10.1155/2022/3720358 (2022).
Alsayat, A. & Ahmadi, H. A hybrid method using ensembles of neural network and text mining for learner satisfaction analysis from big datasets in online learning platform. Neural Process. Lett. 55(3), 3267–3303. https://doi.org/10.1007/s11063-022-11009-y (2022).
Alsayat, A. Improving sentiment analysis for social media applications using an ensemble deep learning language model. Arab. J. Sci. Eng. 47(2), 2499–2511 (2022).
Al-Hashedi, A., et al. Ensemble classifiers for arabic sentiment analysis of social network (twitter data) towards COVID-19-related conspiracy theories. Appl. Comput. Intell. Soft Comput. https://doi.org/10.1155/2022/6614730 (2022).
Ganguly, S., Morapakula, S. N., & Coronado, L. M. P. Quantum natural language processing based sentiment analysis using lambeq toolkit. In ICPC2T 2022 - 2nd International Conference on Power, Control and Computing Technologies, Proceedings, no. June. https://doi.org/10.1109/ICPC2T53885.2022.9776836 (2022).
Mostafa, A. M., Aljasir, M., Alruily, M., Alsayat, A. & Ezz, M. Innovative Forward fusion feature selection algorithm for sentiment analysis using supervised classification. Appl. Sci. 13(4), 1. https://doi.org/10.3390/app13042074 (2023).
Jiang, S., Hu, J., Magee, C. L. & Luo, J. Deep learning for technical document classification. IEEE Trans. Eng. Manag. 1, 1–17. https://doi.org/10.1109/TEM.2022.3152216 (2022).
Article, F. L. et al. Speech communication arabic toxic tweet classification using the AraBERT model.
de Leon, N. P. et al. Materials challenges and opportunities for quantum computing hardware. Science 372(6539), 2823 (2021).
Fuquan, Z. The opportunities and challenges of quantum computing. Biomed. J. Sci. Tech. Res. 6(3), 5–7. https://doi.org/10.26717/bjstr.2018.06.001360 (2018).
Sajwan, P., & Jayapandian, N. Challenges and opportunities: Quantum computing in machine learning. In 2019 Third International conference on I-SMAC (IoT in Social, Mobile, Analytics and Cloud) (I-SMAC), pp. 598–602 (2019).
Joshi, M., Karthikeyan, S., & Mishra, M. K. Recent trends and open challenges in blind quantum computation. In Advanced Network Technologies and Intelligent Computing, Springer Nature Switzerland, pp. 485–496. https://doi.org/10.1007/978-3-031-28183-9_34 (2023).
Ajagekar, A. & You, F. New frontiers of quantum computing in chemical engineering. Kor. J. Chem. Eng. 39(4), 811–820. https://doi.org/10.1007/s11814-021-1027-6 (2022).
Ramezani, S. B., Sommers, A., Manchukonda, H. K., Rahimi, S., & Amirlatifi, A. Machine learning algorithms in quantum computing: A survey. In Proceedings of the International Joint Conference on Neural Networks, no. 2. https://doi.org/10.1109/IJCNN48605.2020.9207714 (2020).
Cirac, J. I. & Zoller, P. A scalable quantum computer with ions in an array of microtraps. Nature 404(6778), 579–581. https://doi.org/10.1038/35007021 (2000).
Rosch-Grace, D., & Straub, J. Analysis of the likelihood of quantum computing proliferation. In Technology in Society, vol. 68, no. August 2021, p. 101880. https://doi.org/10.1016/j.techsoc.2022.101880 (2022).
Chen, B. Q. & Niu, X. F. A novel neural network based on quantum computing. Int. J. Theor. Phys. 59(7), 2029–2043. https://doi.org/10.1007/s10773-020-04475-4 (2020).
Cerezo, M., Verdon, G., Huang, H. Y., Cincio, L. & Coles, P. J. Challenges and opportunities in quantum machine learning. Nat. Comput. Sci. 2(9), 567–576. https://doi.org/10.1038/s43588-022-00311-3 (2022).
Liu, H. et al. Prospects of quantum computing for molecular sciences. Mater. Theory 6(1), 1. https://doi.org/10.1186/s41313-021-00039-z (2022).
Daley, A. J. et al. Practical quantum advantage in quantum simulation. Nature 607(7920), 667–676. https://doi.org/10.1038/s41586-022-04940-6 (2022).
Gupta, A. & Kumar, A. Human decisions and machine predictions. Asian-Eur. J. Math. 12(05), 1950084. https://doi.org/10.1142/S1793557119500840 (2019).
Rifaioglu, A. S. et al. Recent applications of deep learning and machine intelligence on in silico drug discovery: Methods, tools and databases. Brief. Bioinf. 20(5), 1878–1912. https://doi.org/10.1093/bib/bby061 (2019).
Khakpour, A., & Colomo-Palacios, R. Convergence of gamification and machine learning: A systematic literature review, vol. 26, no. 3. Springer Netherlands. https://doi.org/10.1007/s10758-020-09456-4 (2021).
Balasubramanian, N., Ye, Y. & Xu, M. Substituting human decision-making with machine learning: Implications for organizational learning. Acad. Manag. Rev. 47(3), 448–465 (2022).
Chen, F., Cao, Z., Grais, E. M. & Zhao, F. Contributions and limitations of using machine learning to predict noise-induced hearing loss. Int. Arch. Occup. Environ. Health 94(5), 1097–1111. https://doi.org/10.1007/s00420-020-01648-w (2021).
Sáez, C., Romero, N., Conejero, J. A. & García-Gómez, J. M. Potential limitations in COVID-19 machine learning due to data source variability: A case study in the nCov2019 dataset. J. Am. Med. Inf. Assoc. 28(2), 360–364. https://doi.org/10.1093/jamia/ocaa258 (2021).
McCradden, M. D., Joshi, S., Mazwi, M. & Anderson, J. A. Ethical limitations of algorithmic fairness solutions in health care machine learning. Lancet Digital Health 2(5), e221–e223. https://doi.org/10.1016/S2589-7500(20)30065-0 (2020).
Lones, M. A. How to avoid machine learning pitfalls: A guide for academic researchers. pp. 1–25 (2021).
“Just How Much Better is Quantum Machine Learning than its Classical Counterpart?,” QuBytes, Feb. 12, 2021. https://qubytes.org/2021/02/11/just-how-much-better-is-quantum-machine-learning-than-its-classical-counterpart/ (accessed Sep. 08, 2023).
Zeguendry, A., Jarir, Z. & Quafafou, M. Quantum machine learning: A review and case studies. Entropy 25(2), 287 (2023).
Alruily, M. Classification of Arabic tweets: A review. Electronics 10(10), 1. https://doi.org/10.3390/electronics10101143 (2021).
Alqahtani, G. & Alothaim, A. Emotion analysis of Arabic tweets: Language models and available resources. Front. Artif. Intell. 5(March), 1–11. https://doi.org/10.3389/frai.2022.843038 (2022).
Aljunid, M. F., & Manjaiah, D. H. Quantum machine learning: A review and current status, vol. 70. In Advances in Intelligent Systems and Computing, vol. 70. Singapore: Springer Singapore. https://doi.org/10.1007/978-981-15-5619-7 (2021).
Yang, C. H., et al. A quantum kernel learning approach to acoustic modeling. pp. 2–6. https://doi.org/10.48550/arXiv.2211.01263 (2022).
Sharma, D., Singh, P., & Kumar, A. The role of entanglement for enhancing the efficiency of quantum kernels towards classification. arXiv.org, pp. 1–12 (2023).
Li, Y., Zhou, R.-G., Xu, R., Luo, J. & Jiang, S.-X. A quantum mechanics-based framework for EEG signal feature extraction and classification. IEEE Trans. Emerg. Top. Comput. 10(1), 211–222 (2020).
Yang, J., Awan, A. J., & Vall-Llosera, G. Support vector machines on noisy intermediate scale quantum computers. arXiv preprint arXiv:1909.11988 (2019).
Lin, J. et al. Quantum-enhanced least-square support vector machine: Simplified quantum algorithm and sparse solutions. Phys. Lett. A 384(25), 126590 (2020).
Jadhav, A., Rasool, A. & Gyanchandani, M. Quantum machine learning: Scope for real-world problems. Proc. Comput. Sci. 218, 2612–2625 (2023).
Bhattacharyya, S. et al. Quantum machine learning. Quant. Mach. Learn. 1, 1–120. https://doi.org/10.1515/9783110670707 (2020).
Stone, P. Encyclopedia of machine learning and data mining. Encyclop. Mach. Learn. Data Min. 19, 89. https://doi.org/10.1007/978-1-4899-7687-1 (2017).
García, D. P., Cruz-Benito, J., & García-Peñalvo, F. J. Systematic literature review: Quantum machine learning and its applications, vol. 8329, pp. 0–3 (2022).
Winker, T. et al. Quantum machine learning: Foundation, new techniques, and opportunities for database research. Comp. Int. Conf. Manag. Data 2023, 45–52 (2023).
Yi, H. Machine learning method with applications in hardware security of post-quantum cryptography. J. Grid Comput. 21(2), 19 (2023).
Engelsberger, A. & Villmann, T. Quantum computing approaches for vector quantization—current perspectives and developments. Entropy 25(3), 540 (2023).
Tychola, K. A., Kalampokas, T. & Papakostas, G. A. Quantum machine learning—an overview. Electronics 12(11), 2379 (2023).
De Luca, G., & Chen, Y. Teaching quantum machine learning in computer science. In 2023 IEEE 15th International Symposium on Autonomous Decentralized System (ISADS), IEEE, pp. 1–7 (2023).
Bhowmik, B. R., & Manjunath, T. D. Quantum learning and its related applications for the future. In Handbook of Research on Quantum Computing for Smart Environments, IGI Global, pp. 25–47 (2023).
Giuntini, R. et al. Quantum-inspired algorithm for direct multi-class classification. Appl. Soft Comput. 134, 109956 (2023).
Said, D. Quantum computing and machine learning for cybersecurity: Distributed denial of service (DDoS) attack detection on smart micro-grid. Energies 16(8), 3572 (2023).
Omar, A., Mahmoud, T. M., & Abd-El-Hafeez, T. Comparative Performance of Machine Learning and Deep Learning Algorithms for Arabic hate speech Detection in OSNs, vol. 1. Springer International Publishing. https://doi.org/10.1007/978-3-030-44289-7 (2020).
Alrefai, M., Faris, H. & Aljarah, I. Sentiment analysis for Arabic language: A brief survey of approaches and techniques. Int. J. Adv. Sci. Technol. 119(September), 13–24. https://doi.org/10.14257/ijast.2018.119.02 (2018).
Ruskanda, F. Z. et al. Quantum representation for sentiment classification. IEEE Int. Conf. Quant. Comput. Eng. (QCE) 2022, 67–78. https://doi.org/10.1109/QCE53715.2022.00025 (2022).
Kavitha, S. S., & Kaulgud, N. Quantum machine learning for support vector machine classification. Evol. Intell. https://doi.org/10.1007/s12065-022-00756-5 (2022).
Liu, X., Liu, X., Lai, Y., Yang, F., & Zeng, Y. Random decision DAG: An entropy based compression approach for random forest, vol. 11448 LNCS. Springer International Publishing. https://doi.org/10.1007/978-3-030-18590-9_37 (2019).
Alotaibi, M. & Omar, A. An investigation of asthma experiences in Arabic communities through twitter discourse. Int. J. Adv. Comput. Sci. Appl. 14(5), 460–469. https://doi.org/10.14569/IJACSA.2023.0140549 (2023).
Omar, A., Mahmoud, T. M., Abd-El-Hafeez, T. & Mahfouz, A. Multi-Label Arabic text classification and hate speech detection in online social networks. Inf. Process. Manag. 1, 1 (2020).
Omar, A., Mahmoud, T. M., Abd-El-Hafeez, T. & Mahfouz, A. Multi-label Arabic text classification in online social networks. Inf. Syst. 100, 101785. https://doi.org/10.1016/j.is.2021.101785 (2021).

Acknowledgements

The authors sincerely acknowledge the Computer Science Department in the Faculty of Science, Minia University, for the facilities and support.

Funding

Open access funding provided by The Science, Technology & Innovation Funding Authority (STDF) in cooperation with The Egyptian Knowledge Bank (EKB).

Author information

Authors and Affiliations

Department of Computer Science, Faculty of Science, Minia University, EL-Minia, Egypt
Ahmed Omar & Tarek Abd El-Hafeez
Computer Science Unit, Deraya University, EL-Minia, Egypt
Tarek Abd El-Hafeez

Authors

Ahmed Omar
Tarek Abd El-Hafeez

Contributions

This work was carried out in collaboration among all authors. All authors designed the study, performed the statistical analysis, and wrote the protocol. A.O. and T.A.E.H. managed the analyses of the study, managed the literature searches, and wrote the first draft of the manuscript. All authors read and approved the final manuscript.

Corresponding authors

Correspondence to Ahmed Omar or Tarek Abd El-Hafeez.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Omar, A., Abd El-Hafeez, T. Quantum computing and machine learning for Arabic language sentiment classification in social media. Sci Rep 13, 17305 (2023). https://doi.org/10.1038/s41598-023-44113-7

Received: 29 April 2023
Accepted: 03 October 2023
Published: 12 October 2023
DOI: https://doi.org/10.1038/s41598-023-44113-7
