
In the online shopping field, AI draws on several key technologies that work together to enrich the user shopping experience, enable personalized recommendations, and strengthen security. NLP plays a crucial role by parsing users' voice commands, text reviews, and questions, enabling smart systems to discern user needs more accurately and deliver more intelligent services. ML is widely applied in personalized recommendation systems, predicting user interests by analyzing shopping history, behavior, and preferences, thereby improving shopping satisfaction. Image recognition and computer vision allow systems to process product images, identify and classify products, and surface relevant information, making shopping more convenient and efficient. Recommendation systems analyze user shopping behavior and historical data via algorithms, delivering personalized product suggestions and helping users discover new products, which increases the personalization of the shopping experience. Internet security technology is pivotal in safeguarding user privacy and transaction security, strengthening payments and transactions through authentication, data encryption, and risk-identification mechanisms. Together, these key technologies drive the development of AI in online shopping, providing users with a more convenient, personalized, and secure shopping environment.
Emotion-based image retrieval technology is an approach that amalgamates computer vision and sentiment analysis to identify and understand emotional expressions in images. In evaluating shopping satisfaction, this technology accurately captures users’ emotional experiences during the shopping process. This technology relies on extracting image features and abstracting emotional elements in the image into quantifiable data, encompassing features pertinent to color, expressions, and scenes. Through effective feature extraction, the system can better understand the emotions expressed by users while shopping. Figure 1 elucidates the general architecture of emotion-based image retrieval.
To comprehensively evaluate user satisfaction, emotional image retrieval also involves semantic associations and multimodal fusion. This entails amalgamating the emotional information from images with other user feedback forms, such as text comments or shopping behavior data, to garner more accurate and comprehensive evaluations. Leveraging evaluations from user emotional images, personalized recommendation systems can better understand user preferences. This facilitates the system in recommending products that align more closely with users’ emotional needs, thus amplifying the personalization of the shopping experience.
Figure 2 displays the topology of the RNN. Its essence lies in its recurrent structure, enabling it to retain memory for sequential information. At each time step, the RNN receives the current input and the hidden state from the previous time step, updates the hidden state by learning weights, and transmits it to the subsequent time step. Inputs to the RNN typically encompass features of the current time step, such as user behavior data and shopping history. The presence of the recurrent layer allows the network to receive and process sequential data, such as time series, natural language text, or other data with temporal correlations. At each time step, the recurrent layer ingests the current input and the hidden state from the previous time step, generating a new hidden state. This information transmission and memory mechanism facilitates the network in capturing temporal dependencies in the sequence. The output layer typically produces the model’s predictions for user satisfaction.
RNN is a suitable framework for modeling sequential data due to its capability to capture temporal relationships within sequences. In the context of shopping scenarios, user shopping behavior typically manifests as a time series. By modeling this sequence, RNN can adeptly comprehend the dynamic evolution of user behavior.
RNN takes an input vector sequence and derives another vector sequence representing the hidden-layer state. The expression of the hidden-layer state reads:

$$h_t = f(a_t), \qquad a_t = K x_t + U h_{t-1} + b_h$$

The output sequence $y_t$ is represented through the hidden-layer state and can be written as:

$$y_t = g(e_t), \qquad e_t = V h_t + b_y$$

$a_t$ and $e_t$ indicate the weighted sums corresponding to the pre-activations; $f$ and $g$ refer to the activation functions.
During training, at each time step the RNN receives the current input and the hidden state from the previous step, and computes the output for the current time step from both. The updated hidden state is then passed on to the next time step. This iterative process continues until the entire sequence has been processed.
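The loop described above can be sketched in NumPy. The matrix names follow the text's convention ($K$: input-to-hidden, $U$: hidden-to-hidden, $V$: hidden-to-output); the tanh activation and the zero initial hidden state are illustrative assumptions, not specified in the text:

```python
import numpy as np

def rnn_forward(x_seq, K, U, V, b_h, b_y):
    """Minimal RNN forward pass over a sequence:
    h_t = tanh(K x_t + U h_{t-1} + b_h),  y_t = V h_t + b_y."""
    h = np.zeros(K.shape[0])          # assumed zero initial hidden state
    outputs = []
    for x_t in x_seq:
        h = np.tanh(K @ x_t + U @ h + b_h)  # update hidden state from input + memory
        outputs.append(V @ h + b_y)         # output for this time step
    return np.array(outputs), h
```

Each step reuses the hidden state produced by the previous step, which is how the temporal dependency is carried through the sequence.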
In online shopping scenarios, user reviews not only include direct evaluations of product quality but also reflect their usage experiences, emotional tendencies, and potential needs. Relying solely on behavioral data, such as purchase frequency and return rates, may not fully capture user satisfaction. Deeper emotional insights and key themes can be extracted from review texts by incorporating NLP techniques, thus providing a richer basis for satisfaction assessment. Sentiment analysis of online review information is a method that utilizes NLP and ML techniques to analyze comments posted by users on the internet. This process aims to determine the sentiment conveyed in the comments. Such analysis furnishes valuable insights for reputation management, market analysis, and improvement for businesses, brands, and products.
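As a toy illustration of how review text can be turned into a sentiment signal, a lexicon-based scorer is sketched below; real systems use trained NLP models, and the word lists here are invented for the example:

```python
# Invented mini-lexicons for illustration only.
POSITIVE = {"great", "excellent", "love", "fast", "recommend"}
NEGATIVE = {"broken", "slow", "disappointed", "refund", "poor"}

def review_sentiment(text):
    """Score a review by counting positive vs. negative lexicon hits."""
    words = text.lower().split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"
```

For example, a review mentioning "fast" and "recommend" scores positive, while one mentioning "broken" scores negative; such labels can then feed the satisfaction assessment alongside behavioral data.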
Figure 3 depicts the RNN topology structure based on behavioral sequence data. The input at time $t$, denoted $x_t$, is a vector, and the weights from the input layer to the hidden layer are represented by the matrix $K$. The RNN is trained by backpropagation, mainly employing the iterative approach of gradient descent; this ultimately determines appropriate weight matrices $U$, $W$, $V$, and $K$ and the bias $b$ used to fit the RNN. The parameters of the RNN model for predicting online shopping user behavior follow the standard gradient-descent update.
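A generic form of these updates, with an assumed learning rate $\eta$ (the per-parameter gradients are the standard backpropagation-through-time expressions):

```latex
\theta \leftarrow \theta - \eta \, \frac{\partial L}{\partial \theta},
\qquad \theta \in \{U, W, V, K, b\}
```

For instance, with output pre-activation $e_t = V h_t + b_y$, the gradient for $V$ accumulates over time steps as $\partial L / \partial V = \sum_t (\partial L / \partial e_t)\, h_t^{\top}$.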
Traditional RNNs encounter challenges when training on long sequences and suffer from the vanishing gradient problem. To mitigate these challenges, the GRU is introduced. The GRU comprises two primary gating units: the update gate and the reset gate. These gates allow the GRU to selectively remember or forget previous states.
On the one hand, the update gate regulates the retention of information from the current state, with its output ranging from 0 to 1. Values closer to 1 signify the retention of more information from the prior state. On the other hand, the reset gate governs the amalgamation of information from the previous state with the present input, thereby determining the extent to which the previous state’s information should be “forgotten.” Through these gate control mechanisms, GRU enhances the model’s ability to learn and retain long-term dependencies in sequence data while mitigating the vanishing gradient problem.
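A minimal single-step GRU in NumPy makes the two gates concrete. This is the standard formulation; acting on the concatenation $[h_{t-1}, x_t]$ and the weight names are implementation choices:

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def gru_cell(x_t, h_prev, Wz, Wr, Wh, bz, br, bh):
    """One GRU step. Each W* acts on the concatenation [h_prev, x_t]."""
    hx = np.concatenate([h_prev, x_t])
    z = sigmoid(Wz @ hx + bz)  # update gate: how much prior state to keep
    r = sigmoid(Wr @ hx + br)  # reset gate: how much prior state to "forget"
    h_tilde = np.tanh(Wh @ np.concatenate([r * h_prev, x_t]) + bh)  # candidate state
    return (1 - z) * h_prev + z * h_tilde  # blend old state with candidate
```

Because $z$ and $r$ lie in $(0,1)$, the cell can interpolate smoothly between retaining memory and absorbing new input, which is what mitigates the vanishing gradient problem.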
Although the GRU alleviates the vanishing gradient problem through its gate mechanisms, it may still struggle to fully capture the dynamic changes in user satisfaction when dealing with complex online shopping behaviors. Therefore, a dynamically weighted GRU (DW-GRU) is proposed, which introduces a dynamic weighting mechanism on top of the GRU to adapt to the dynamic changes in online shopping satisfaction. The algorithm design process is as follows:
Input Layer: This layer receives the user’s shopping behavior sequence data, including browsing, purchasing, reviewing, etc., along with corresponding timestamp information. After preprocessing, these data form the input sequence of the model.
Embedding Layer: Each behavior in the input sequence is transformed into a fixed-dimensional embedding vector to capture the semantic relationships between behaviors.
Dynamic Weighting Mechanism: Based on GRU, a dynamic weighting calculation module is introduced. This module computes a dynamic weight value based on the user’s behavior and timestamp information at the current time step. This weight value adjusts the outputs of the update gate and reset gate in GRU. It enables the model to dynamically regulate the transmission and memory of information based on user behavior at different time steps.
GRU: Based on the dynamic weighting mechanism, GRU units are constructed to handle sequence data. The update gate and reset gate in GRU are computed using the current input and the hidden state from the previous time step, with their outputs further refined by the dynamic weight value. The update gate governs the retention of existing information, whereas the reset gate regulates the creation of new information.
Output Layer: According to the output of GRU, the output layer predicts the user’s shopping satisfaction at the current time step. The softmax function transforms the output into a probability distribution for multi-classification tasks.
In the DW-GRU model, at each time step $t$, the dynamic weight is calculated based on the user’s behavior and timestamp information. The weight is calculated as follows:

$$w_t = \sigma\left(W \cdot [h_{t-1}, x_t] + b\right)$$

$\sigma$ represents the sigmoid activation function; $W$ refers to the weight matrix; $h_{t-1}$ denotes the hidden state of the previous time step; $x_t$ means the input for the current time step; $b$ stands for the offset term. The value of the dynamic weight $w_t$ lies between 0 and 1 and is used to adjust the outputs of the update and reset gates.
In the GRU, the update gate $z_t$ and the reset gate $r_t$ are calculated as follows, with their outputs being modulated by the dynamic weight:

$$z_t = \sigma\left(W_z \cdot [h_{t-1}, x_t] + b_z\right), \qquad r_t = \sigma\left(W_r \cdot [h_{t-1}, x_t] + b_r\right)$$

The adjusted gating signals read:

$$\tilde{z}_t = w_t \cdot z_t, \qquad \tilde{r}_t = w_t \cdot r_t$$

$z_t$ and $r_t$ represent the outputs of the update and reset gates before the dynamic-weight adjustment.
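The full DW-GRU step described above can be sketched in NumPy. The parameter names and the elementwise scaling of both gates by $w_t$ are assumptions consistent with the description, not a definitive implementation:

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def dw_gru_cell(x_t, h_prev, p):
    """One DW-GRU step: a dynamic weight w_t, computed from the current input
    and previous hidden state, scales the update and reset gates."""
    hx = np.concatenate([h_prev, x_t])
    w_t = sigmoid(p["W"] @ hx + p["b"])          # dynamic weight in (0, 1)
    z = w_t * sigmoid(p["Wz"] @ hx + p["bz"])    # modulated update gate
    r = w_t * sigmoid(p["Wr"] @ hx + p["br"])    # modulated reset gate
    h_tilde = np.tanh(p["Wh"] @ np.concatenate([r * h_prev, x_t]) + p["bh"])
    return (1 - z) * h_prev + z * h_tilde
```

The hidden state produced at the final time step would then feed a softmax output layer to yield the satisfaction probability distribution.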
The DW-GRU model can output the probability distribution of a user’s shopping satisfaction at the current time step. These outputs are converted into easily understandable scores or labels to make this information practical for stakeholders or business users. Satisfaction is categorized into several levels (e.g., very satisfied, satisfied, neutral, dissatisfied, very dissatisfied) based on the highest probability value, and these levels are directly presented to e-commerce platform managers. This allows for the optimization of product recommendations and service improvements based on user satisfaction feedback.
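The conversion from probability distribution to satisfaction level can be a simple argmax lookup; the five levels follow the text, while their ordering here is an assumed convention:

```python
import numpy as np

# Satisfaction levels from the text; their ordering is an assumption.
LEVELS = ["very dissatisfied", "dissatisfied", "neutral",
          "satisfied", "very satisfied"]

def satisfaction_label(probs):
    """Map the model's softmax output to the highest-probability level."""
    return LEVELS[int(np.argmax(probs))]
```

For example, `satisfaction_label([0.05, 0.10, 0.10, 0.15, 0.60])` returns `"very satisfied"`.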
During the model training, the weights of various features are dynamically adjusted based on the actual performance of the data or the model’s predicted results. This adjustment can be based on multiple strategies, such as the feature’s contribution, importance, or correlation. After the weight adjustments, the model recalculates the prediction results according to the new weights and updates the model parameters to better fit the data in subsequent training.
The following methods can achieve a dynamic weighting mechanism and the integration of neural networks. (1) Feature weighting: In the input layer of the neural network, features are weighted to ensure that the model focuses more on important features during training; (2) Inter-layer weighting: A dynamic weighting mechanism can also be introduced between layers of the neural network, adjusting the input weights of the next layer based on the output of the previous layer; (3) Loss function weighting: A dynamic weighting term can be incorporated into the neural network’s loss function, allowing the model to pay more attention to specific loss terms or objective functions during optimization.
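Strategy (1), feature weighting at the input layer, can be sketched as a simple re-weighting step; the normalization and the function name are illustrative choices, not from the text:

```python
import numpy as np

def weighted_input(x, importance):
    """Scale input features by importance scores before they enter the
    network, so training focuses more on important features."""
    w = np.asarray(importance, dtype=float)
    w = w / w.sum()   # normalize weights to sum to 1 for stability
    return x * w      # elementwise feature re-weighting
```

In practice the importance scores could come from feature-contribution or correlation analysis and be updated during training, as the dynamic adjustment above describes.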
Emotion-based image retrieval technology combines computer vision and emotion analysis to capture users’ emotional experiences during shopping more accurately. It is significant for comprehensively evaluating user satisfaction because it considers not only the emotional tendencies expressed in textual reviews but also the emotional expressions conveyed in images, which it quantifies by extracting features related to color, facial expressions, and scenes. Through effective feature extraction, the system can better comprehend the emotions users express while shopping.

