Dimensionality reduction of independent influence factors in the objective evaluation of quality of experience

Big Data analytics and Artificial Intelligence (AI) technologies have become the focus of recent research due to the large amounts of available data. Dimensionality reduction techniques are recognized as an important step in these analyses. The multidimensional nature of Quality of Experience (QoE) is based on a set of Influence Factors (IFs), where a higher number of IFs is preferable because it yields better QoE prediction. As a consequence, dimensionality issues occur in QoE prediction models. This paper gives an overview of the dimensionality reduction techniques used in QoE modeling and proposes a modification and use of the Active Subspaces Method (ASM) for dimensionality reduction. The proposed modified ASM (mASM) uses variance/standard deviation as a measure of function variability. A straightforward benefit of the proposed modification is the possibility of its application in cases when discrete or categorical IFs are included. The application of modified ASM is not restricted to QoE modeling only. The obtained results show that the QoE function is mostly flat for small variations of input IFs, which is an additional motive to propose a modification of the standard version of ASM. This study proposes several metrics that can be used to compare different dimensionality reduction approaches. We prove that the percentage of function variability described by an appropriate linear combination(s) of input IFs is always greater than or equal to the percentage that corresponds to the selection of input IF(s) when the reduction degree is the same. Thus, the proposed method and metrics are useful for optimizing the number of IFs for QoE prediction and for a better understanding of the IF space in terms of QoE.

(3) Introduction of new metrics for the quantitative measurement of the amount of function variability described by a linear combination of IFs, enabling comparison of the feature selection and feature extraction approaches; (4) Numerical analysis with performance comparison of ASM and mASM.
To accomplish the given objectives, we organize the rest of the paper as follows. Section "Related work" presents the related work considering dimensionality reduction in QoE modeling. Section "Problem statement" gives the problem statement, section "Overview of existing method and metrics" the theoretical background, and section "Introduction of new method and metrics" the mathematical model of the method and metrics introduced in this paper. Section "Numerical analysis" presents the numerical results obtained for the ASM and mASM methods. Section "Discussion" discusses the obtained results and outlines recommendations for future work. Section "Conclusion" concludes the paper.

Scientific Reports | (2022) 12:10320 | https://doi.org/10.1038/s41598-022-13803-z

Dimensionality reduction methods are divided into two groups: methods based on subsets and methods based on subspaces (i.e. selection and extraction). Manifold learning and embedded learning are synonyms for feature extraction 31. The main objective of feature extraction is to obtain subspaces in which more inference and more efficient learning can be achieved 30. Feature selection includes (1) filter methods, (2) wrapper methods, and (3) embedded methods 32. Filter methods rank features according to chosen criteria. Wrapper methods employ a learning algorithm as part of the feature selection. Embedded methods combine the qualities of filter and wrapper methods. Based on the listed dimensionality reduction methods, an overview of which methods are suitable for an independent set of IFs is given first, followed by an overview of the application of all the above methods to the input set of IFs in QoE modeling.

Dimensionality reduction methods appropriate for an independent set of IFs. Since the listed feature extraction methods rest on the assumptions that features are correlated and share some information 33, it is necessary to examine whether these methods are appropriate for application to an independent set of input IFs in QoE modeling. Unsupervised feature extraction methods seek correlation between input IFs to perform the reduction. PCA is a method in which linear transformations of correlated variables are generated to produce relatively uncorrelated variables 11. Other methods are highly interrelated and in special cases equivalent to PCA 34, which explains the initial statement that these methods were introduced to remove redundancy in input data 33. LSA is designed for text documents with the aim of learning semantic representations of text. PP predefines an objective function called the projection pursuit index, and projection is done by maximizing this function.
ICA performs a linear transformation of correlated input data such that the outputs are independent. LPP relies on a linear approximation of Laplacian Eigenmaps with the aim of preserving distances between samples when projecting data to a lower-dimensional space. Laplacian Eigenmaps use the similarity of neighbouring samples. KPCA is a non-linear PCA, so the data are first mapped using a non-linear function 35. MDS uses a distance metric and becomes PCA when the Euclidean distance is used. Isomap and LLE overcome the disadvantage of PCA of not capturing the possible non-linear structure of patterns. Accordingly, Isomap, LLE, and Laplacian Eigenmaps can be considered special cases of KPCA, whereas KPCA is identical to PCA when a linear kernel is used.
Since dimensionality reduction of an independent set of IFs requires information about the change of QoE, supervised methods are appropriate for this task. The supervised methods LDA and LVQ are applicable to an independent set of inputs and can be used for weight determination. Besides dimensionality reduction, LDA is a classification approach, and it relies on the means of samples and covariance matrices computed from training samples from different groups 36. LDA determines weighted coefficients for the inputs so as to give the best separation between known groups of observations. Accordingly, applying LDA and ASM to the input set of IFs gives different weighted coefficients with different application possibilities. LDA uses classes of QoE, and the transformation of the original space of input IFs is made so as to obtain a projection maximizing the ratio between different QoE classes while minimizing the ratio within QoE classes. So, besides dimensionality reduction (data preparation), LDA is used for data modeling. ASM, in contrast, sees QoE as a function of input IFs where QoE values need not be constrained to a limited set of values, and the transformation of the original space of input IFs is made so that the first dimension contains the highest change of QoE, each following dimension is less important, and in the last dimensions there are no major changes of the QoE function or the changes are zero. So, the basic disadvantage of LDA in comparison to ASM is that LDA is limited to classification problems only. LDA does not address the problem of continuous target variables, so such techniques are not applicable to the family of regression problems 37. Similar to LDA, LVQ also performs classification using distances between input vectors, and its advantage is classification accuracy 31, but it retains the same disadvantage as LDA in comparison to ASM.
According to the above, the application of ASM in the data preparation phase enables selection of the reduction degree according to the desired accuracy and acceptable complexity, and a better understanding of the input IF space, without restricting the later data modeling phase to classification methods only.

Applied dimensionality reduction methods on the input set of IFs. Neglecting approaches where IFs are ignored based on experience, and according to the above-mentioned methods for feature extraction and feature selection, Table 1 depicts related studies that have used dimensionality reduction methods for data preparation in the process of modeling perceived QoE. From the review of references (Table 1), the following can be noticed:
• The dimension of the IF vector ranges from 3 to 5200, so it can be concluded that the application of a dimensionality reduction technique is not limited by the dimension of the input vector (although this can be subject to further consideration depending on the type of dimensionality reduction technique) [51][52][53]. There are some special application aims, such as determining the weighted coefficients to be used within a mathematical model 50. A combined application of the feature selection and feature extraction methods is given in 53, where LDA is used for QoE prediction.
• The achieved degree of dimensionality reduction after the data preparation phase ranges from 99.6% (input: 5200, output: 20) 47 to the example where no dimensionality reduction technique was applied (0% reduction) 50. The reduction degree that a dimensionality reduction technique can accomplish is essentially conditioned by the selection of the input IFs; the selection can be such that several IFs describe the same change in the system, making them highly interdependent.

Problem statement
For the formal introduction of methods and metrics, the general notation is as follows. Let f be a function of N inputs x = (x_1, . . . , x_N) with probability density function ρ(x) that is uniform on [−1, 1]^N and zero elsewhere. The mean and variance of the function f are given by

μ = ∫ f(x) ρ(x) dx,   σ_f² = ∫ (f(x) − μ)² ρ(x) dx.

We write f_i(x_i) for f regarded as a function of x_i only, where the remaining inputs x_j, j ∈ {1, . . . , N}\{i}, are fixed. The subscript (x_i) represents the i-th input, whereas the superscript (x^s) represents the s-th sample; therefore x_i^s is the s-th sample of the i-th input. Dimensionality reduction in QoE modeling implies a function QoE defined on an N-dimensional set of IFs {IF_i}_{i=1}^N which is mapped to a lower-dimensional set. A special case of this mapping is the linear transformation IF_new = W^T IF, where W = [w_ij] is the matrix of weighted coefficients. Special cases of a linear transformation are feature selection and even weight distribution. For feature selection, the weighted coefficients of the selected set of IFs {IF_i}_{i=1}^M are 1, w_ii = 1, while the other weighted coefficients are 0, w_ij = 0, i ≠ j. Even weight distribution is a model where all IFs are nearly equally important, with weighted coefficients w_ij ≈ 1/N for all i = 1, . . . , N, j = 1, . . . , N.
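As a small illustration, the two special cases of the linear transformation IF_new = W^T IF can be sketched in NumPy (the IF values and the selected indices below are hypothetical):

```python
import numpy as np

# Sketch of the linear transformation IF_new = W^T * IF for N = 4 IFs.
# The IF values and the selected indices are illustrative only.
N = 4
IF = np.array([0.8, -0.2, 0.5, 0.1])       # one sample of the IF vector

# Feature selection as a special case: keep IF_1 and IF_3 only,
# i.e. the selected IFs get coefficient 1, all others 0.
W_sel = np.zeros((N, N))
W_sel[0, 0] = 1.0                           # IF_1 -> first new coordinate
W_sel[2, 1] = 1.0                           # IF_3 -> second new coordinate
IF_selected = W_sel.T @ IF                  # first M = 2 components are kept

# Even weight distribution: all coefficients approximately 1/N.
W_even = np.full((N, N), 1.0 / N)
IF_even = W_even.T @ IF                     # every new coordinate is the mean
```

Here feature selection simply permutes the selected IFs into the first M coordinates, while even weighting collapses every new coordinate to the average of the inputs.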
QoE as a function defined on an N-dimensional space can be flat or have negligible changes in some directions. These directions are optimal for dimensionality reduction, since neglecting them causes the slightest loss of information about the changes of QoE. The objective of linear-transformation feature extraction is to determine the weighted coefficients so that the first M new vectors IF_new contain the highest changes of QoE, whereas for the remaining N − M new vectors QoE is flat or has negligible changes. The dimension of the reduced space M can be chosen from 1 to N, 1 ≤ M ≤ N, where for M = N a rotation of the initial space is done without reduction. Choosing M < N, the initial N-dimensional input space is reduced to an M-dimensional space.
Function variability can be measured through the first derivative, as the slope or "rate of change" of a function, and through the variance/STD, as a measure of how spread out a function is. Finding the derivative of the function at points gives the small-scale behaviour of the function near those points. Variance/STD measures the variability of the function around its mean. The appropriate choice of the measure of variability depends on the application 60. Directions in which the first-order derivatives of the function equal zero, or in which the variance/STD equals zero, can be used for dimensionality reduction since they determine regions where the function is flat. Figure 1 gives an overview of methods and metrics according to which measure of function variability is used; the methods and metrics marked in red are introduced in this paper. The global sensitivity analysis metrics use derivatives and variance as measures of function variability: the derivative-based global sensitivity metric uses derivatives, whereas Sobol's total sensitivity metric uses variance. Sensitivity analysis is a study connected to feature selection since it determines how the variations in the output of a model can be apportioned to different sources of variation 61. Sensitivity metrics are classified as local (a nominal parameter value is changed slightly to measure the model's response) and global (the importance of each variable is measured over a range of parameter values). The basic approach is varying the input parameters of a model to see how the output is affected 11. Commonly used sensitivity metrics are the above-mentioned derivative-based global sensitivity metric and Sobol's total sensitivity metric 60.
The dimensionality reduction method ASM uses derivatives as a measure of function variability. The outputs of ASM are eigenvalues and eigenvectors used to form IF_new. The eigenvectors contain information on the influence of a particular IF on QoE per dimension, so using the ASM approach the weighted coefficients can be determined, where every reduction degree (M ≤ N) is associated with the corresponding eigenvalue λ_M. The mean squared directional derivative of QoE with respect to the eigenvector w_i is equal to the corresponding eigenvalue λ_i 12, so for the smallest λ_i ≈ 0 the changes of QoE are zero or negligible. The weighted coefficients matrix W can be interpreted as follows: the first column {w_i1}_{i=1}^N contains the components of the first eigenvector, which represent the weighted coefficients of particular IFs and carry information about their influence; thus, when reducing the initial N-dimensional space to a 1-dimensional space, the influence of a particular IF is determined by these weighted coefficients. A particular row i of the weighted coefficients matrix, {w_ij}_{j=1}^M, corresponds to a particular IF and contains information on the portion of influence of that IF per dimension.
The first row is connected to the first IF, so the first element w_11 is the influence of IF_1 on QoE in the first dimension, w_12 the influence of IF_1 on QoE in the second dimension, while w_1N is the influence of IF_1 on QoE in the least important dimension, where QoE is nearly flat.
Determining gradients when using ASM can be a challenge, since the explicit functional dependence of QoE on the IFs is mostly unknown. Usually an approximation needs to be used, commonly the finite differences method. Also, gradients cannot be used when variables are categorical or discrete. Therefore, methods that overcome these problems are of importance. We introduce mASM as a dimensionality reduction method which uses variance/STD as a measure of variability; this completely overcomes the problem of finding gradients and is applicable to a wider range of input IFs. The use of variance in QoE modeling to describe the relationship between independent inputs and QoE exists through the statistical ANalysis Of VAriance (ANOVA) 62–66, whereas in this paper variance/STD is used for dimensionality reduction as input to the SVD, analogously to how gradients are used as input to the SVD in the case of ASM. ANOVA is also used in dimensionality reduction, but as a criterion for feature selection 67. mASM differs from PCA and its supervised modification 68, where variance is calculated over the inputs, while in mASM the variance/STD is calculated over the function of the inputs. A modification of ASM is also given in 69, where the modification consists of using the average of gradients; this does not overcome the issue of applicability to categorical variables and still requires calculation or approximation of gradients, which may be difficult or inadequate for some IFs.
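A minimal NumPy sketch of this idea, under our reading of mASM: at each sample point, the per-input STD of the function (estimated from N_s resamples of one input at a time) replaces the gradient as the input column vector to the SVD. The test function and sample sizes are illustrative, not the paper's data:

```python
import numpy as np

# mASM sketch: per-input STD of f replaces the gradient in the ASM pipeline.
rng = np.random.default_rng(1)
N, Np, Ns = 3, 100, 5

def f(x):
    # illustrative function: x0 dominates, x2 is nearly inactive
    return 4 * x[0]**2 + x[1]**2 + 0.01 * x[2]**2

def std_vector(x):
    # STD of f per input: vary one input (Ns resamples), keep the others fixed
    v = np.empty(N)
    for i in range(N):
        xs = np.tile(x, (Ns, 1))
        xs[:, i] = rng.uniform(-1, 1, Ns)
        v[i] = np.std([f(row) for row in xs])
    return v

X = rng.uniform(-1, 1, size=(Np, N))
S = np.array([std_vector(x) for x in X])    # Np x N matrix of STD vectors

# SVD of the scaled STD matrix: squared singular values play the role of
# the eigenvalues, columns of W the eigenvectors (as with gradients in ASM)
_, s, Vt = np.linalg.svd(S / np.sqrt(Np), full_matrices=False)
eigvals, W = s**2, Vt.T
vsfs = (W**2) @ eigvals                     # VSFS, analogous to ASFS
```

No gradients are needed anywhere; for a discrete or categorical input the resampling step simply draws from the set of categories instead of a uniform interval.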
The feature selection metric Activity scores for feature selection (ASFS) is a sensitivity metric obtained from the ASM procedure which quantifies how much each IF describes the change of QoE. ASFS is comparable with the derivative-based global sensitivity metric since both are based on derivatives, and according to 60 ASFS is equal to the derivative-based global sensitivity metric when M = N. Analogously to ASFS, in this paper we introduce Variance/STD scores for feature selection (VSFS) as an output of mASM and compare it with Sobol's total sensitivity metric.
The feature extraction metric Activity scores for feature extraction (ASFE) is introduced in this paper; it quantifies how much a weighted combination of IFs describes the changes of QoE. Using ASFE and ASFS it is possible to compare the feature extraction and feature selection approaches for every reduction degree. Analogously to ASFE, in this paper we introduce Variance/STD scores for feature extraction (VSFE) as an output of mASM, with the possibility to compare it with VSFS.

Overview of existing method and metrics
According to Fig. 1, this section gives an overview of the existing methods and metrics. Since determining gradients is not a problem specific to QoE modeling, and discrete and categorical inputs are not present only in IF spaces, we use general notation in the sequel.
Global sensitivity analysis metric: Sobol' total sensitivity metric. Sobol' total sensitivity analysis is derived from the functional ANOVA decomposition, or the variance-based decomposition. Let i ∈ {1, . . . , N} and let S_i be the set of subsets of {1, . . . , N} containing the index i. The ANOVA decomposition of a function is f(x) = Σ_{u⊆{1,...,N}} f_u(x). The sensitivity metric is the total effect index 60,70

τ_i = (1/σ_f²) Σ_{u∈S_i} σ_u²,   (1)

where σ_u² is the variance of the component function f_u. Jansen's formula for the approximation of τ_i is

τ_i ≈ (1/(2 N_p σ_f²)) Σ_{s=1}^{N_p} ( f(A_s) − f(A_B^(i),s) )²,   (2)

where A and B are two independent sample matrices, A_B^(i) is the matrix A with its i-th column replaced by the i-th column of B, and σ_f² is the variance of the N_p evaluations of the function f in f(A), which approximates the variance of f.
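A sketch of Jansen's estimator with the usual two-matrix (A/B) sampling scheme; the additive test function is illustrative, not the paper's QoE data:

```python
import numpy as np

# Jansen-style Monte Carlo estimate of Sobol's total effect index tau_i.
rng = np.random.default_rng(2)
N, Np = 3, 20000

def f(X):
    # illustrative additive function: x0 dominates, x2 is nearly inactive
    return 4 * X[..., 0]**2 + X[..., 1]**2 + 0.01 * X[..., 2]**2

A = rng.uniform(-1, 1, size=(Np, N))
B = rng.uniform(-1, 1, size=(Np, N))
fA = f(A)
var_f = np.var(fA)                   # sigma_f^2 approximated from f(A)

tau = np.empty(N)
for i in range(N):
    AB = A.copy()
    AB[:, i] = B[:, i]               # resample only input i
    tau[i] = np.mean((fA - f(AB))**2) / (2 * var_f)
```

For an additive function like this one the total indices approximately sum to 1, and the dominant input x0 receives almost all of the variability.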
Global sensitivity analysis metric: derivative based global sensitivity metric. Derivative based global sensitivity analysis is based on the changes of the output caused by small variations of the model inputs, analysed through derivatives. The metric can be expressed as 60,71

ν_i = ∫ ( ∂f/∂x_i (x) )² ρ(x) dx,   (3)

and its Monte Carlo estimate for N_p points is

ν_i ≈ (1/N_p) Σ_{s=1}^{N_p} ( ∂f/∂x_i (x^s) )².   (4)

Dimensionality reduction method: active subspaces method. In general, ASM is a feature extraction method where each direction is determined by a set of weighted coefficients that defines a linear combination of all inputs. The reduction is based on an estimate of whether the function prediction changes as the inputs move along a particular direction; a direction can be ignored in the parameter study if there is no change along it or the change is negligible. The assumptions for the application of ASM are as follows: the simulation model has N defined inputs and a measured scalar quantity of interest, and ranges are specified for each of the independent inputs.

Figure 2. Overview of derivative and variance/STD based methods and metrics for feature selection and feature extraction.

The required number of input points is

N_p = α_of M log N,   (5)

where the real constant α_of is an oversampling factor that is usually chosen in the range between 2 and 10 12, and M ≤ N is the number of eigenvalues to be used in the model after reduction. The eigenvalues are used to determine the size of the active subspace, based on the gaps between them, whereas the corresponding eigenvectors define the active subspace. The theory behind active subspaces begins with the matrix C defined as

C = ∫ ∇f(x) ∇f(x)^T ρ(x) dx,   (6)

where f is the quantity of interest in a given computational model, the gradient of f is taken with respect to the model parameters, and ρ is the probability density function. The input column vector is

∇f(x) = [ ∂f/∂x_1, . . . , ∂f/∂x_N ]^T.   (7)
Since C is a symmetric matrix, its eigendecomposition is possible,

C = W Λ W^T,

where Λ = diag(λ_1, . . . , λ_N) with λ_1 ≥ · · · ≥ λ_N ≥ 0, and W is an orthogonal matrix whose columns are the orthonormal eigenvectors w_i, i = 1, . . . , N, corresponding to λ_1, . . . , λ_N, respectively. When the eigendecomposition is performed, it is possible to partition the eigenvalues and eigenvectors as

Λ = [Λ_1, Λ_2],   W = [W_1, W_2],

where Λ_1 contains the "large" eigenvalues and Λ_2 the "small" ones, while W_1 contains the eigenvectors assigned to the "large" eigenvalues and W_2 the eigenvectors assigned to the "small" eigenvalues. The active subspace is obtained from the gradients ∇f(x), and its determination requires the ability to calculate gradients, or gradient approximations, at any point x of the domain under consideration. In the case where the gradients are unknown and the simulation is manageable, it is possible to use approximate values of the gradients obtained through the method of finite differences; in this case the required number of simulations is α_of M (N + 1) log N. An approximation of the eigenvalues and eigenvectors of the matrix C defined by (6) can be done using a random sampling algorithm, applying the SVD to the matrix of sampled gradients,

(1/√N_p) [ ∇f(x^1), . . . , ∇f(x^{N_p}) ] = W √Λ V^T,

where the singular values are the square roots of the eigenvalues and W contains the eigenvectors.

Feature selection metric: activity scores for feature selection. ASFS is a sensitivity metric obtained from the eigenpairs according to 60 and is used to rank the importance of inputs, so ASFS can be expressed as

α_FS,i = Σ_{j=1}^{M} λ_j w_ij²,  i = 1, . . . , N,   (15)

where λ_j is the j-th eigenvalue and w_j = [w_1j, w_2j, . . . , w_Nj]^T is the j-th eigenvector. According to this metric, the importance of a particular input is expressed through how much changes of that input change the function on average. The interpretation of ASFS 60 explains that scaling each eigenvector by its corresponding eigenvalue is reasonable for the construction of a global sensitivity metric.
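Under the construction above, the ASM pipeline (sample gradients, SVD, eigenpairs, ASFS) can be sketched in NumPy; the test function with analytic gradients and the sample count are illustrative. For M = N the resulting ASFS coincide with the Monte Carlo estimate (4) of the derivative-based metric:

```python
import numpy as np

# ASM sketch: sample gradients, SVD, eigenpairs, then ASFS for M = N.
rng = np.random.default_rng(0)
N, Np = 3, 200

def grad_f(x):
    # gradients of the illustrative f(x) = 4*x0^2 + x1^2 + 0.01*x2^2
    return np.array([8 * x[0], 2 * x[1], 0.02 * x[2]])

X = rng.uniform(-1, 1, size=(Np, N))
G = np.array([grad_f(x) for x in X])             # Np x N gradient samples

# SVD of the scaled gradient matrix: singular values are the square roots
# of the eigenvalues of C, columns of W the eigenvectors
_, s, Vt = np.linalg.svd(G / np.sqrt(Np), full_matrices=False)
eigvals = s**2
W = Vt.T

# ASFS for M = N: alpha_FS,i = sum_j lambda_j * w_ij^2; this equals the
# Monte Carlo estimate of v_i = E[(df/dx_i)^2] from (4)
asfs = (W**2) @ eigvals
v = np.mean(G**2, axis=0)
```

Because C ≈ G^T G / N_p, the diagonal entries of C are exactly the Monte Carlo estimates v_i, which is why ASFS and the derivative-based metric agree at M = N.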
The eigenvector w_1 identifies the most important direction in the parameter space in the following sense: perturbing the input along w_1 changes f more, on average, than perturbing the input orthogonally to w_1 60. The components of w_1 measure the relative change in each component of the input along this most important direction, so they impart significance to each component of the input. The second most important direction is the eigenvector w_2, and the relative importance of w_2 is measured by the difference between the eigenvalues λ_1 and λ_2 60. The ASFS (15) are bounded above by the derivative-based global sensitivity metric given by (3), with equality when M = N (17). The variance of f_i(x_i) for a numerical variable is estimated from N_s samples, where N_s is the number of samples in the group determined by x_i. The sum of squares for a categorical variable is given by 73

Σ_δ p(δ) (δ − μ_i)²,

where p(δ) is the probability that f_i(x_i) takes the value δ and μ_i is the mean of f_i(x_i).
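For a categorical input, the variance of f_i(x_i) follows directly from the value probabilities p(δ); a minimal sketch with illustrative values:

```python
import numpy as np

# Variance of f over a categorical input: the values delta that f_i(x_i)
# can take and their probabilities p(delta) are illustrative.
values = np.array([3.2, 4.1, 4.5])    # possible values of f_i(x_i)
p = np.array([0.5, 0.3, 0.2])         # p(delta) for each value

mu = np.sum(p * values)               # mean of f_i(x_i)
var = np.sum(p * (values - mu)**2)    # variance: no gradient required
```

This is the reason mASM extends to categorical IFs: the spread of the function over the categories is well defined even though a derivative is not.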

Introduction of new method and metrics
According to Fig. 1, this section gives the mathematical background of the new methods and metrics and introduces the appropriate notation.
Dimensionality reduction method: modified ASM. In this paper we propose a modification of ASM in which variance/STD is used as the measure of the function variability. STD and variance operate on a one-dimensional space and can be calculated for each dimension independently of the other dimensions, using either the variance (18) or the STD (23). Analogously to the gradient column vector (7) used in ASM, mASM uses the input column vector of variances/STDs

v(x) = [ σ_1(x), . . . , σ_N(x) ]^T,   (24)

where σ_i(x) is the variance/STD of f over N_s samples in which only x_i varies while the other inputs remain fixed. The matrix C is formed from v(x) in the same way as from ∇f(x) in (6) (26), and the eigendecomposition of C given in (6) and (26) yields the eigenvalues and eigenvectors used for the reduction.

Feature selection metric: variance scores for feature selection. Variance scores for feature selection (VSFS) is a sensitivity metric introduced analogously to ASFS,

β_FS,i = Σ_{j=1}^{M} λ_j w_ij²,  i = 1, . . . , N,   (33)

where λ_j and w_ij are the eigenvalues and eigenvector components obtained by mASM.
The required number of simulations for mASM is

N_sim = α_of M (N_s N + 1) log N.   (29)

A comparison of ASM and mASM is given in Table 2.

Feature extraction metric: activity scores for feature extraction. A new metric for feature extraction is introduced in order to compare, for each reduction degree M ≤ N, the feature extraction and feature selection approaches. Analogously to the case of ASFS, scaling each eigenvector by its corresponding eigenvalue is the basis for the specification of the Activity scores for feature extraction (ASFE) of dimension M.

Definition 3
Let λ_j and w_ij be defined as in (15). The Activity scores for feature extraction of reduction degree M, 1 ≤ M ≤ N, is

α_FE,M = Σ_{j=1}^{M} λ_j Σ_{i=1}^{N} w_ij².   (34)

Important properties of this metric are proven in the following theorems. The ASFS α_FS,i for a specified M ≤ N is a column vector {α_FS,i}_{i=1}^N containing information about the importance of each input i. The ASFE α_FE,M for a specified M ≤ N is a scalar containing information about the importance of the linear combination of all inputs.

Theorem 1 The Activity scores for feature extraction α_FE,M correspond to the sum of the eigenvalues,

α_FE,M = Σ_{j=1}^{M} λ_j.
Since W is an orthogonal matrix whose columns are normalized eigenvectors, Σ_{i=1}^{N} w_ij² = 1, so α_FE,M = Σ_{j=1}^{M} λ_j, as required.

Table 2 compares ASM and mASM in terms of the measure of function variability, the input column vector, and the required number of simulations.

Theorem 3 The Activity scores for feature extraction are greater than or equal to the Activity scores for feature selection at the same reduction degree M,

α_FE,M ≥ α_FS,i(M),  i = 1, . . . , N.   (36)

According to (36), it can be concluded that in QoE modeling, after a linear transformation of the input set of IFs by the weighted coefficients determined by ASM, the reduced space specified by IF_new will always contain the same or more information about the changes of QoE than the selection of any set of input IFs. Using the properties proven in Theorem 3, it is possible to specify a relative ratio between the function variability described by the selected reduction degree and approach, and the cumulative function variability

Σ_{j=1}^{N} λ_j.   (37)

With α_FE,M and α_FS,i(M) defined by (34) and (15) respectively, the relative ratio for feature extraction is

r_FE(M) = α_FE,M / Σ_{j=1}^{N} λ_j,   (38)

whereas for feature selection it is

r_FS,i(M) = α_FS,i(M) / Σ_{j=1}^{N} λ_j.   (39)

Feature extraction metric: variance/STD scores for feature extraction. Analogously to ASFE, the Variance/STD scores for feature extraction (VSFE) can be defined as follows.

Definition 4 Let α_FE,M and α_FS,i(M) be defined as in (34) and (15), respectively; the relative ratios for feature extraction and feature selection are then given by (38) and (39).
Definition 5 Let λ_j and w_ij be defined as in (33). The Variance/STD scores for feature extraction (VSFE) of reduction degree M, 1 ≤ M ≤ N, is

β_FE,M = Σ_{j=1}^{M} λ_j Σ_{i=1}^{N} w_ij².
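The properties claimed in Theorems 1 and 3 can be checked numerically on the eigenpairs of any symmetric positive semidefinite matrix (a random illustrative C below): ASFE of degree M equals the sum of the first M eigenvalues and bounds every α_FS,i(M) from above:

```python
import numpy as np

# Numeric check of Theorem 1 (ASFE_M = sum of the first M eigenvalues) and
# Theorem 3 (ASFE_M >= ASFS_i(M)) on a random symmetric PSD matrix.
rng = np.random.default_rng(4)
N, M = 5, 2
A = rng.normal(size=(N, N))
C = A @ A.T                                    # symmetric PSD "C" matrix
eigvals, W = np.linalg.eigh(C)
eigvals, W = eigvals[::-1], W[:, ::-1]         # sort eigenpairs descending

# ASFE (34): sum_j lambda_j * sum_i w_ij^2 over the first M eigenpairs
asfe = sum(eigvals[j] * np.sum(W[:, j]**2) for j in range(M))
# ASFS (15) per input at reduction degree M
asfs = (W[:, :M]**2) @ eigvals[:M]
```

The bound holds because each column of W is normalized (Σ_i w_ij² = 1) and every eigenvalue of a PSD matrix is non-negative, so no single row of squared eigenvector components can exceed the full column sums.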

Numerical analysis
Based on the above-mentioned mathematical introduction of the method and metrics, this section presents a numerical analysis that includes multiple simulations to obtain QoE values for IF analysis. The following tools are used. MATLAB 75 is used for random selection of the values of the input IFs, implementation of ASM and mASM, calculation of the metrics, neural network modelling, and data analysis. The video sequence is coded using the ffmpeg tool 76, and video transmission simulation is performed in the NS3 simulator 77 using the EvalVid evaluation tool 78 for QoE metric estimation. The video sequence is the widely used Akiyo sequence (352 × 288 resolution, 30 fps, 10 s long), which can be accessed from 79. Objective measurement of QoE is made for the selected input points. The MOS tool is based on the MOS calculation of every single frame of the received video and its comparison with the MOS of the corresponding frame of the original video. The MOS values determined for all input points, approximated gradients, and samples are used to form the input column vectors for ASM and mASM. In this paper, two sets of IFs are analysed. Set 1 includes IFs to which both the ASM and mASM methods can be applied. Set 2 includes the IFs from Set 1 and an additional IF which is a categorical variable. For Set 2, only the mASM method is applied, since the calculation of gradients for categorical variables is not possible and therefore the ASM method is not applicable. An overview of the input IFs is given in Table 3.

Eigenvalues evaluation, ASFS/VSFS estimation, and performance analysis. The selected values of the input IFs uniquely determine a 10-dimensional input point for Set 1 and an 11-dimensional input point for Set 2. Input points are chosen by random selection of all input IFs. The required number of points is determined according to (5), so simulation results are obtained for 40/60/80 input points. A comparison of the ASM and mASM methods is performed for Set 1; the results of applying the mASM method to Set 2 are also given below. Following the ASM and mASM procedures, the input column vector of gradients [see (7)] is used to approximate the eigenvalues and eigenvectors for ASM, whereas the input column vector of variances [see (24)] is used in the case of mASM (Fig. 3). Figure 3c shows a gap between λ_1 and λ_2, implying the possibility of reducing the dimension to one both for ASM and for all N_s for mASM with N_p = 80. A gap in the eigenvalues indicates the separation between the active and inactive subspaces, and the computed eigenvectors are more accurate when there is a significant gap between eigenvalues. The eigenvalues show that the change of the function in the 10th dimension is negligible. Comparing ASM and mASM, the different gaps in these methods are the result of the different measures of function variability. Although a larger number of input points gives better prediction accuracy, a similar gap interpretation applies to the eigenvalues for 40/60 input points (Fig. 3a,b). Figure 3d gives an overview of the eigenvalues for Set 2 (11 IFs), for mASM only, where the separation between the active and inactive subspaces can also be observed.
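The gap-based choice of the active subspace dimension can be sketched as follows (the eigenvalues are illustrative, not the measured values of Fig. 3):

```python
import numpy as np

# Choose the reduction degree M at the largest gap between consecutive
# eigenvalues (illustrative spectrum with a clear gap after the first value).
eigvals = np.array([9.1, 0.8, 0.31, 0.12, 0.05,
                    0.02, 0.008, 0.003, 0.001, 0.0002])
gaps = eigvals[:-1] - eigvals[1:]        # gaps between consecutive eigenvalues
M = int(np.argmax(gaps)) + 1             # keep the eigenpairs before the gap
```

In practice the gap is often assessed on a logarithmic scale, since eigenvalues can span several orders of magnitude.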
Dominant changes exist in the first dimension. The magnitudes of the components of the approximated eigenvector corresponding to λ_1 are given in Fig. 4. For both ASM and mASM, the weighted coefficients for N_p = 40 have approximately the same values as for N_p = 60 and N_p = 80, so a further increase in the number of points is not expected to lead to significant changes in the values of the weighted coefficients. For Set 2, it is important to note that the addition of a new IF leads to new weighted coefficients, whose values depend on the influence of that particular IF on the QoE metric.
ASFS α_FS and VSFS β_FS are used to determine the importance of the model input IFs [see (15) and (33)]. Figure 5a–c presents ASFS and VSFS for all input IFs. Similar results are obtained for different numbers of input points. According to ASFS and VSFS, the dominant IF is CRF. After CRF, the next most important IF is FPS for ASM, and distance for mASM. It can be concluded that a small variation of these IFs changes the QoE metric more than a small variation of the other IFs. The accuracy of the prediction increases with the number of input points, so the sensitivity analysis for N_p = 80 will be used for comparison with the global metrics. Comparison of ASFS and VSFS for N_p = 80 and N_s = 5 (VSFS) gives a different order of importance. For both methods the tenth parameter, MaxSsrc, has the least influence on the QoE metric. The different order is due to the different metrics that measure the function variability. By ignoring categorical IFs, the most influential IF is neglected, confirming that as many IFs as possible need to be considered for an accurate QoE estimate. The accuracy of the dimensionality reduction methods is tested using additional simulation measurements. In this phase, data modelling is done on the rotated and reduced set of input IFs using a neural network according to 80. According to 81, performance evaluation is done using the following evaluation indexes:
• Pearson Correlation Coefficient (PCC)
• Root Mean Square Error (RMSE)
• Mean Absolute Error (MAE)
• Root Relative Squared Error (RRSE)
In Fig. 6 it can be seen that the accuracy of the mASM method with 11 IFs is significantly higher than that of mASM with 10 IFs and ASM with 10 IFs. Adding new IFs to the QoE estimation increases the prediction accuracy.
The prediction accuracy of the ASM and mASM methods with 10 IFs is approximately the same, and the prediction accuracy confirms once again that neglecting the most important IF significantly reduces the accuracy of prediction.

Comparison with global metrics. The comparison of ASFS and VSFS with the global metrics is done for Set 1, where N_p = 80 is used for ASFS and it is compared with the derivative-based global sensitivity metric [see (4)]. The sensitivity analysis for N_p = 80 and N_s = 1 for VSFS is compared with Sobol's total sensitivity metric [see (2)], where N_s = 1. The same samples are used for the calculation of VSFS and Sobol's total sensitivity metric. According to (17), ASFS is equal to the derivative-based global sensitivity metric for M = N, as can be seen in Table 4. Analogously, VSFS is equal to Sobol's total sensitivity metric multiplied by the approximated variance σ_f² for M = N (Table 5). It can be concluded that, besides the fact that ASFS is consistent with the rankings produced by the derivative-based global sensitivity metric 60, VSFS is also consistent with the rankings produced by Sobol's total sensitivity metric. For nicely behaved functions all metrics are consistent, which is the common case for practical models. As shown, this holds in our example for the QoE metric, where the most important IFs are recognized as the most important by all metrics, as are the unimportant IFs. In Table 6 the Activity scores α_FE for all reduction degrees are compared with the largest α_FS for Set 1, with the indicated percentage of how much variability of the QoE metric is described for the considered reduction degree. It can be concluded that the feature extracted in the first dimension describes 83.6% of the change of the QoE metric [see (38)], whereas the preferred IF for feature selection is CRF, which describes 50.6% of the change of the QoE metric [see (39)]. Thus, by introducing a new variable as a linear combination of all input IFs, a one-dimensional space is obtained which describes the largest QoE metric fluctuations, and ignoring the other dimensions means ignoring smaller QoE metric oscillations. In contrast, if we choose CRF, all variations of the QoE metric caused by changes of the other IFs will be ignored, thus neglecting larger QoE metric oscillations.
For M = N = 10, only a rotation of the initial space is done, so there is no loss, and both metrics show that 100% of function variability is described. The choice between the two types of dimensionality reduction, feature selection and feature extraction, is always in favour of feature extraction. If the difference is negligibly small for the selected reduction degree, feature selection can be considered the better choice due to its lower budgetary complexity; this would mean that there is a dominant input IF (or IFs) and that the others are negligible, i.e., their variations do not change the QoE metric at all. In Table 7, variance/STD scores for all β_FE are compared with the largest β_FS. Analogously to ASM, the variance/STD scores confirm that, for the same reduction degree, the change in the QoE metric is better described by a linear combination of all IFs than by the selection of any set of input IFs. A linear combination of input IFs therefore gives a better overview of the change of the QoE metric for both the activity scores and the variance scores.

Table 6. Comparison of activity scores for feature extraction and activity scores for feature selection.
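Why feature extraction can never do worse than feature selection at the same reduction degree can be sketched with the Rayleigh quotient. In the notation below (ours, not necessarily the paper's), C is the symmetric positive semidefinite matrix whose eigendecomposition ASM or mASM computes, and e_i is the i-th coordinate axis:

```latex
% The leading eigenvalue maximizes the Rayleigh quotient over all unit
% directions, so it bounds every diagonal entry of C:
\lambda_1 \;=\; \max_{\|w\|=1} w^{\mathsf{T}} C\, w
\;\;\ge\;\; e_i^{\mathsf{T}} C\, e_i \;=\; C_{ii}
\qquad \text{for every input IF } i .

% More generally (Schur--Horn: the diagonal of C is majorized by its
% eigenvalues), the top m eigenvalues dominate any m diagonal entries:
\frac{\sum_{k=1}^{m} \lambda_k}{\operatorname{tr}(C)}
\;\;\ge\;\;
\frac{\sum_{i \in S} C_{ii}}{\operatorname{tr}(C)} ,
\qquad |S| = m ,
% i.e. the variability fraction captured by the first m eigen-directions
% (extraction) bounds that of any m selected input IFs (selection).
```

This is exactly the pattern visible in Tables 6 and 7: equality is attained only when a coordinate axis happens to coincide with an eigen-direction.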

Discussion
Multidimensional QoE analysis has become imperative to improve the QoE modeling process. The curse of dimensionality refers to all problems connected with the high dimension of data that do not arise at lower dimensions. A high-dimensional data set may contain many features that are all measurements of the same underlying cause; such features are closely related, and the features of such a data set contain much overlapping information82. To address the posed challenge of dimensionality reduction of the input set of independent IFs in QoE modeling, this study applies the derivative-based ASM and introduces a modification of ASM that uses variance/STD as the measure of function variability. The appropriate choice of the measure of variability depends on the application. The advantage of mASM in QoE modeling is the possibility to use categorical variables, with no need for calculating gradients, which is difficult or inadequate for some IFs. Since the functional dependency between input IFs and QoE is mostly unknown, finite difference methods must be used to approximate the gradients, which introduces an error at the input of the method, in contrast to the use of variance. We also observed that the QoE function mostly does not change much at smaller shifts along the dimensions, resulting in gradients with a value of 0 in all dimensions. The method interprets this as all inputs being equally important, although changing a specific input may not change QoE at all. Consequently, less important inputs can receive higher weighting coefficients, which is an additional disadvantage of using gradients. The disadvantage of using variance is the need for more simulations in its calculation. Besides the modification of the existing method, this paper introduces new metrics for the comparison of feature selection and feature extraction approaches.
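The two failure modes of gradients described above, flat regions and categorical inputs, can be demonstrated with a toy example. The piecewise-flat QoE surrogate and the one-at-a-time variance scoring below are hypothetical illustrations of the variance/STD idea behind mASM, not the paper's exact formulation:

```python
import numpy as np

rng = np.random.default_rng(1)

def qoe(codec, bitrate):
    # Piecewise-flat toy QoE: small shifts in `bitrate` do not change the
    # score, and `codec` is categorical (0 or 1). Illustration only.
    base = 3.0 if codec == 0 else 4.0
    return base + (1.0 if bitrate > 0.5 else 0.0)

# Finite-difference "gradient" w.r.t. bitrate at a typical point: the
# locally flat region hides the step entirely, so the estimate is 0.
h = 1e-3
x = 0.3
fd = (qoe(0, x + h) - qoe(0, x - h)) / (2 * h)

# Variance-based score (the mASM idea): vary one IF while holding the
# others fixed, and use the sample variance of QoE as the measure of
# variability. This needs no gradient and accepts categorical inputs.
bitrates = rng.uniform(0.0, 1.0, 500)
var_bitrate = np.var([qoe(0, b) for b in bitrates])
var_codec = np.var([qoe(c, 0.3) for c in rng.integers(0, 2, 500)])

print(fd, var_bitrate, var_codec)  # fd is 0.0; both variances are nonzero
```

The finite-difference score wrongly reports bitrate as inert and cannot be formed at all for the categorical codec, while the variance score detects both influences, which mirrors the motivation for mASM given above.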
Application of dimensionality reduction before QoE prediction can provide models with varying degrees of complexity and accuracy for devices with different processing and memory power. Reducing the dimensionality of the input data set can speed up the training of machine learning algorithms used for QoE prediction. The use of machine learning and large amounts of data in QoE assessment is part of the strategy for developing big-data-driven intelligent networks. The development of AI is based on machine learning over big data collected at multiple points in current and future networks, which need to adjust intelligently to the environment while maintaining quality at a satisfactory level. The estimated QoE can be used as an input to achieve spectral efficiency, energy efficiency, cost reduction, etc. Analyzing variations in the quantity of interest with ASM and mASM as dimensionality reduction methods can link network resources for optimal resource reservation and architecture design, with delay as the quantity of interest and the measure of quality for latency modeling purposes. The quantity of interest could also be power consumption, which is particularly important for solutions with limited battery life, such as sensors in IoT networks.
The knowledge obtained in this study can help interested stakeholders, including mobile network operators, technology developers, software solution providers, and the research community, to improve QoE input data by including a data preparation phase in order to achieve an optimal trade-off between complexity and accuracy, thus optimizing the overall process for the specific application. Mobile network operators face increasing user requirements in the context of quality, which becomes a challenge with the ever-increasing demands of multimedia applications under limited resources. Innovation processes and end-user roles are strongly connected, so technology developers cannot simply separate user experience from technology. Optimized QoE inputs with varying degrees of complexity offer applicability across different technologies and prediction models. Software solution providers can improve their algorithm design to meet QoE requirements with appropriate QoE inputs. Academic and research communities can use this knowledge to further improve dimensionality reduction methods and QoE prediction models.
This study has several contributions and implications. Firstly, the original contribution of this study is, to the best of our knowledge, the first attempt to review previous applications of dimensionality reduction to the input set of IFs in QoE modeling. This overview lists the applied dimensionality reduction methods together with the achieved degree of dimensionality reduction. It can serve as a basis for introducing new dimensionality reduction techniques in combination with different algorithms for QoE prediction for different purposes.
Secondly, ASM and mASM, the dimensionality reduction techniques used in this study, differ in preconditions and outputs from the dimensionality reduction techniques listed in the overview of related works. A special contribution of the paper is the modification of the ASM method, which is not limited to QoE modeling but can be used to reduce the dimensionality of input spaces that contain categorical or discrete variables, and for relatively flat functions. Since the IF input space and the QoE function satisfy both conditions, the preferred method for dimensionality reduction of the independent set of input IFs is mASM. Thirdly, the metrics used and introduced in the paper enable the comparison of different approaches and provide information on QoE changes for different spaces, which is an addition to the analysis of multidimensional QoE. Understanding the dataset, with its strengths and weaknesses, is crucial for good QoE prediction, since each model learns differently, and what is an advantage in one model may be a weakness in another. Simpler models are generally easier to control, and the reasons for their inaccuracies are easier to identify; the metrics introduced in the paper provide information on the loss of accuracy caused by model simplification.

Table 7. Comparison of variance/STD scores for feature extraction and variance scores for feature selection.
This study also provides several implications. There are important theoretical implications showing that the use of dimensionality reduction is justified in the QoE preparation phase. Our findings extend previous work with a new method applied to a different input set with additional outputs, thus complementing and opening up new possibilities for the application of dimensionality reduction techniques in the QoE preparation phase. Practical implications of using ASM and mASM as dimensionality reduction methods are connected with the need to manipulate multiple independent inputs and to know the change of QoE under variations of all independent inputs. Besides its huge volume, data generated within the network also has a non-homogeneous structure, where input information is incomplete and ambiguous. In addition to the theoretical and practical implications, this study also reveals implications for future research. As the study is limited to 11 input IFs, a future study should extend this set to give a better picture of the impact of different SIFs, as well as CIFs and HIFs, on QoE, especially for mASM. A comparison of different QoE prediction techniques in combination with different dimensionality reduction methods would allow a more comprehensive review, and the performance of the various prediction models will be the focus of future work. In addition, the application of ASM and mASM is not limited to QoE as the quantity of interest; latency and/or power consumption, for example, can also be the focus of future work.

Conclusion
A comprehensive understanding of QoE change requires the analysis of as many IFs as possible, which dictates the introduction of tools that can handle spaces of large dimension and large amounts of data. This is the motivation for this study, whose objective is the optimal description of the IFs input space depending on the change of QoE. In this regard, a review of related works was made regarding the dimensionality reduction techniques used, together with an overview of existing techniques suitable for an independent set of input IFs. A modification of the method and new metrics are introduced for a more comprehensive analysis of the IF and QoE spaces. The optimal dimensionality reduction approach is feature extraction, whereas the optimal reduction degree is a trade-off between accuracy and complexity.
According to the above, the original contributions of this study follow the objectives and can be summarized as follows:
- Meta-analytical overview of the dimensionality reduction techniques applied to the input set of IFs, thus creating a basis for extending the methods used in QoE modeling. LDA, LVQ, and ASM are recognized as methods applicable to an independent set of input IFs, where the advantage of ASM is that it is not limited to classification only;
- Introduction of a modification of ASM with variance/STD as the measure of function variability, thus overcoming the problem of gradient calculation and creating the possibility of applying the method to discrete and categorical IFs. Modified ASM is not limited to application in QoE modeling;
- Introduction of the new metrics ASFE, VSFS, VSFE, and the relative ratio R(%), which allow the comparison of feature selection and feature extraction approaches. It is proved that a linear combination of input IFs, with weighting coefficients determined using the ASM or mASM method, is always a better choice than the selection of any IF or combination of IFs for a given reduction degree;
- Application of ASM and mASM to the selected set of input IFs with an objective evaluation of QoE, comparing ASFS and VSFS with global metrics, and then comparing ASFS with ASFE and VSFS with VSFE for different reduction degrees. Numerical analysis for the selected IFs showed that the QoE function is suitable for dimensionality reduction, with arbitrary flat directions. It has also been observed that smaller shifts in IFs do not alter QoE much, which is another advantage of introducing variance/STD as the measure of function variability in mASM. Performance analysis has shown that the mASM method achieves greater accuracy compared to ASM when the input data set contains categorical variables.