Functional Requirements and Object-Oriented System Modeling for Designing AI-Driven Intelligent Catering Systems

Kaihong Feng; Fang Zhao; Qinkai Yang

doi:10.3791/69360

Research Article

October 31st, 2025

Kaihong Feng¹ , Fang Zhao² , Qinkai Yang³

¹Faculty of Information Science and Technology, The National University of Malaysia, ²College of Mathematics Science, Inner Mongolia Minzu University, ³Faculty of Information Science and Technology, The National University of Malaysia

Summary

This study introduces an AI-based restaurant catering system that supports contactless communication, customized meal suggestions, and satisfaction prediction. By combining NLP with LDA, Conv-RNN, and Conv-LSTM, it outperforms rule-based techniques in accuracy, precision, and recall, with lower error rates, demonstrating AI's potential in the food service industry.

Abstract

The food industry has undergone a significant transformation in recent decades due to globalization, technological advancements, and evolving customer expectations. Artificial Intelligence (AI) and the Internet of Things (IoT) are now playing a critical role in enhancing food production, marketing, and service delivery. This study proposes an AI-driven intelligent system to improve restaurant catering services through contactless service using Natural Language Processing (NLP) and Latent Dirichlet Allocation (LDA), personalized food recommendations through a Convolutional Recurrent Neural Network (Conv-RNN) model, and customer satisfaction prediction using an optimized Convolutional Long Short-Term Memory (Conv-LSTM) model. Real-world experiments demonstrate that the proposed system outperforms traditional rule-based methods, achieving 91.5% accuracy, 91% precision, 91.1% recall, and an F1 score of 89.7% with Word2Vec-LDA; 98.5% accuracy with a loss of 0.02 in the Conv-RNN model; and an RMSE of 0.1011 with an R² of 0.9812 in the Conv-LSTM system. These results highlight the transformative potential of AI in automating and enhancing customer service in the restaurant industry.

Introduction

Adoption of AI has been a crucial part of digital technology growth for the last decade. It has given several industries, including the hospitality sector, both possibilities and challenges since its start¹, and numerous AI-powered inventions have been developed that have the potential to improve people's quality of life and thereby enhance the economy. In the very competitive restaurant industry, maintaining top-notch food and customer service is essential to success. As technology advances and dining experiences shift, AI is becoming a game-changing tool to increase operational effectiveness and customer satisfaction. AI-powered monitoring systems are transforming restaurant operations² to better manage their kitchens, keep an eye on food quality, and deliver top-notch customer service. Through the use of advanced algorithms and real-time data analytics, these technologies streamline operations and guarantee consistency, safety, and excellence in every aspect of the dining experience. It is now possible for restaurants to achieve a higher degree of precision for normal operating procedures².

Overall financial success, adaptability to changing circumstances, and the ability to expand and change its offers to meet customer needs and expectations are all factors in the tourist and hospitality industry, and these factors frequently determine whether a business survives³. Therefore, the tourism and hospitality sector is using advanced technologies like AI and robotics (AIR) to enhance client service and experience. These technological advancements are being used as intelligent tools for customer care in order to improve client experience⁴. Furthermore, corporate performance may be enhanced by the rapid advancement of AI in hospitality management. An example of a data-intensive industry that collects vast amounts of data in various formats is the hotel sector.

Inefficiencies in operational management, growing consumer expectations for individualized services, growing labor shortages, and the requirement for precise demand forecasts are some of the ongoing issues facing the hospitality industry. Conventional approaches frequently fail to adequately handle these problems, which raises operating expenses and results in uneven service quality. By automating tedious processes, facilitating data-driven decision-making, boosting demand prediction, optimizing pricing and inventory management, and enhancing consumer personalization, artificial intelligence (AI) provides solutions. AI is becoming more and more positioned as a game-changing instrument for enhancing the hospitality sector's operational effectiveness and guest experience by filling in these gaps.

By improving operational effectiveness and enriching customer experiences, AI is transforming customer care in the restaurant industry². AI capabilities such as machine learning and predictive analytics are crucial for streamlining procedures like demand forecasting and inventory management. Better service consistency and lower operating costs are the results of these advancements⁵,⁶. Additionally, by evaluating consumer data to offer tailored menu suggestions and promotions, AI facilitates personalized interactions and promotes customer loyalty and satisfaction⁷. Intelligent systems play a key role in the travel and tourism sector, increasing competition by concentrating on sophistication⁸. Hotels are experimenting with cutting-edge technology, including digital strategies⁹ and robots enhanced with AI and the Internet of Things (IoT). Robots powered by AI and related technologies are becoming increasingly common¹⁰. Voice assistants (VAs), which are AI devices activated by voice commands, provide a degree of human-like intelligence through digital interfaces¹¹. Despite these advancements, VAs face difficulties such as low user awareness, annoyance, and occasional opposition from both hotel employees and guests¹².

Delivery and takeout operations are growing in the restaurant industry as societal dynamics change and more Americans choose convenient, time-saving options¹³. Nearly 60% of consumers in the US place an order for delivery or takeout at least once a week, and 78% of consumers actively use online food ordering platforms¹⁴. This trend is reinforced by reports that the online food delivery market reached $220 billion by the end of 2023, accounting for 40% of restaurant sales, and that the market will reach $365 billion by the end of 2030¹⁵. It is anticipated to increase to over USD 534.60 billion by 2028. With a predicted compound annual growth rate of approximately 9.5% through 2034, the global market was valued at approximately USD 288.87 billion in 2024 and is anticipated to reach USD 316.31 billion by the end of 2025¹⁵. Furthermore, the number of direct online orders increased by 54% between 2019 and 2021, indicating that customers are increasingly choosing digital ordering over traditional dine-in options. Text mining and data-driven decision-making are of growing significance in service management, illuminating new management domains and important topics such as market intelligence and social media analysis¹⁶. This research provides a compelling argument for restaurants to use chatbots to improve operational effectiveness and customer engagement. Furthermore, the research by Proenca and Soukiazis¹⁷ highlights the potential of data-driven approaches in marketing and customer relationship management initiatives, offering insight into how chatbots can be used to maximize customer interactions and spur company expansion.

To increase efficiency and deliver individualized, customer-focused service, the restaurant industry is implementing Robotics, Artificial Intelligence, and Service Automation (RAISA)¹⁸. As with any other technological change, the use of RAISA in restaurants has a number of implications that should be assessed¹⁹. By lowering errors and improving overall service quality, RAISA can provide dependable, standardized service²⁰. Additionally, it can provide clients with a distinctive interactive experience. According to Kreishan¹⁸, service robots are now more sophisticated, independent, and adaptable. However, they can also affect the general ambiance of a restaurant as well as the social interactions between patrons, employees, and robots. It is critical to evaluate the factors that influence technology acceptance because managers, employees, and consumers perceive these elements differently, and their readiness to interact with RAISA is changing²¹.

Omni models' significant linguistic capabilities have been highlighted by their improved sentiment classification accuracy in hotel review analysis, which reached over 67% compared to 60.6% for BERT²². At the same time, recommender systems are progressively using Large Language Models (LLMs) to improve customization by managing profiles and analyzing user reviews. While hybrid systems integrate LLM outputs with graph neural network embeddings to improve recommendation accuracy in sparse data circumstances, the PURE framework uses LLMs to dynamically update user profiles based on reviews²³. Meanwhile, in session-based situations, transformer-style recommendation architectures, such as Transformers4Rec, are beating conventional RNN models²⁴.

There are still a number of unexplored or underexplored facets of intelligent catering, despite the quick developments in AI, automation, and smart technology. These challenges include managing energy efficiency and food waste through AI-driven systems, scaling personalization for a variety of customer preferences, integrating AI with traditional kitchen workflows, protecting customer privacy and security in customer profiling, and implementing intelligent catering solutions for small and medium-sized restaurants with limited funding. For intelligent catering solutions to be widely and sustainably adopted, these gaps must be filled, which creates an opportunity for further study and innovation. Potential research gaps in intelligent catering systems include the following:

Although AI systems are capable of making meal recommendations based on user choices, they frequently struggle to completely comprehend the subtleties of dietary requirements and personal preferences²⁵. The majority of solutions are still somewhat basic and prone to mistakes like predicting mismatches; however, some intelligent catering systems employ AI to manage inventory or estimate demand. A lot of intelligent catering systems only have one way to interface, such as touch screens or voice recognition²⁶. Many systems are still limited in their ability to fully comprehend or forecast individual preferences, even if AI agents can provide a certain level of customization based on consumer history or preferences. Basic inputs like demographic data or previous orders are frequently used by AI bots.

Recent advancements in machine learning have increasingly focused on modeling human behavior and contextual data across diverse domains. In their evaluation of data sources and methodologies for urban building occupancy profiles, Nejadshamsi et al.²⁷ emphasized the value of heterogeneous data in capturing dynamic behavioral patterns. Building on this, Nejadshamsi et al.²⁸ showed how well deep learning predicts spatial-temporal flows of human activity by putting forth a geographic-semantic context-aware commuting flow prediction model utilizing graph neural networks. Similarly, Nejadshamsi et al.²⁹ highlighted the importance of contextual cues in improving predictive performance by creating a transportation-informed framework for urban-scale occupancy and energy estimation.

Improving customer experience and increasing operational efficiency have become crucial success elements in the quickly changing restaurant industry. Many restaurants still struggle to deliver smooth, individualized, and effective services despite the increasing use of technology because they lack integrated intelligent systems. Key issues, including enhancing contactless service, providing individualized meal recommendations, and precisely forecasting client happiness in real time, are frequently overlooked by current systems²⁵. By creating an intelligent catering system that uses cutting-edge technologies to enhance restaurant operations, this project aims to close this gap. Three essential elements are integrated into the system:

Contactless Service using Latent Dirichlet Allocation (LDA) and Natural Language Processing (NLP): By leveraging LDA and NLP, effective contactless customer interactions are made possible, cutting down on waiting times and human error.

Conv-RNN-based Recommendation Systems for Food Suggestions: This enhances menu satisfaction by using a Convolutional Recurrent Neural Network (Conv-RNN) to produce dynamic, tailored food recommendations based on consumer preferences.

Predicting Customer Satisfaction with an Optimized Conv-LSTM Model: Restaurants can make data-driven improvements by using an improved Convolutional Long Short-Term Memory (Conv-LSTM) model to forecast customer happiness based on real-time data and feedback. In addition to offering contactless engagement and customized eating experiences in a more scalable and effective way, the suggested intelligent system seeks to improve customer satisfaction, streamline operations, and improve service delivery.

We contrast this work with a number of previous methods in order to place it within the present context of AI-driven food recommender systems. For instance, a transformer-based sequential conversational recommendation framework that uses self-attention processes to capture discussion dynamics was developed by Zou et al.³⁰. In order to facilitate dialogue and visual-based recommendation tasks, Gambetti and Han³¹ developed AiGen-FoodReview, a multimodal dataset that consists of matched restaurant review texts and photos. Previously, MenuAI was created by Ju et al.³² and uses transformer models to make menu item recommendations straight from textual menu graphics. Interpretability and computational efficiency are occasionally compromised by these approaches, despite their excellent skills in processing context or multimodal information. Our integrated NLP-LDA + Conv-RNN + Conv-LSTM system, on the other hand, strikes a balance between explainability, lightweight deployment, and high prediction accuracy, which makes it particularly appropriate for catering situations with limited resources.

Examining how AI-powered technology may improve hospitality standards and expedite restaurant operations through Intelligent Catering Services (ICS) is the main objective of this study. Latent Dirichlet Allocation (LDA) and NLP are combined in the suggested method to efficiently handle client inquiries contactlessly. A Conv-RNN is used to produce meal recommendations in order to provide individualized experiences, and an optimized Conv-LSTM model is used to forecast consumer satisfaction levels. By integrating these elements, the created ICS shows practical use in raising overall customer satisfaction, guaranteeing quality, and increasing service efficiency. Using performance measures on food recommendation accuracy and satisfaction prediction, experimental results verify the model's efficacy.

The proposed intelligent catering system provides a contactless service for food suggestions and customer satisfaction predictions. A contactless intelligent catering system is mostly focused on automation and convenience, enabling customers to interact with the system in a touchless manner while performing specific tasks like food suggestion and satisfaction prediction. An AI agent, by contrast, goes beyond automation by using data-driven insights to make more accurate, adaptive, and personalized decisions, enhancing customer experience and improving operational efficiency across a wider range of functions. The goal of this study is to develop an AI-driven Intelligent Catering System (ICS) that integrates NLP-LDA for contactless interactions, Conv-RNN for personalized food recommendations, and Conv-LSTM for predicting customer satisfaction. This system is useful for restaurants because it enhances operational efficiency, reduces costs, delivers consistent service, and improves customer engagement through personalization and real-time feedback.

Protocol

This study was conducted in accordance with the guidelines of the Research Ethics Committee of The National University of Malaysia (UKM) and approved under approval number UKM FST/2025-AI/023. Written informed consent was obtained from all participants prior to the collection of chatbot queries. All data were anonymized to ensure participant confidentiality and privacy.

Study overview

An overview of the proposed AI-assisted intelligent catering system is shown in Figure 1. As illustrated, the customer input is preprocessed with NLP techniques such as word embeddings, lemmatization, and tokenization to extract the tags. An LDA model is then applied to these customer tags to provide contactless service. Food suggestion is carried out using a Conv-RNN model: based on the flow sequence recorded from previous customers' choices, food is suggested to the customer intelligently. Finally, the customer satisfaction level is predicted using an optimized Conv-LSTM model to support further improvement of the restaurant's services. The performance of the proposed AI models is evaluated using several evaluation metrics.

Figure 1: Proposed system model for intelligent catering services (ICS). The architecture integrates user interaction, data preprocessing, intent detection, food recommendation, and feedback mechanisms.

Dataset used

To create the intelligent catering system, we gathered 283 requests from a nearby restaurant using a chatbot. These queries, which covered a broad variety of client inquiries from menu details to operating hours, were manually divided into 15 different intent groups. This ensures thorough coverage of the possible user interactions with the system. The intent classes were designed to capture specific aspects of customer inquiries, including salutations, goodbyes, appreciation, catering, hours, setting, contact information, questions about payments, today's menu, delivery alternatives, menu questions, ordering processes, special deals, bookings, beverage options, and availability of outdoor seating. Table 1 shows the frequency of queries in each of the intent groups and the classification according to the queries' thematic content. For example, the Contact Information category received the most queries, suggesting that patrons are very interested in finding out how to get in touch with the restaurant. On the other hand, the Seating and Beverage categories received the fewest inquiries, which may indicate that there is less consumer interest in these subjects or that these subjects are already sufficiently clear to customers.

Contactless service using NLP with LDA

The user input that starts the framework's activity is first preprocessed to standardize the text and eliminate noise. This preparation includes tokenization, stop word elimination, and lemmatization, and it gets the data ready for further analysis. After preprocessing, user inquiries are converted into numerical representations that capture linguistic features such as semantic relationships and contextual relevance. Several machine learning classifiers were trained on these vector representations of user queries to classify them into 16 pre-defined classes (intents/tags). Once the model has predicted the intent of a user's query, it fetches the corresponding predefined response and returns it to the user.

Preprocessing

Our analysis's dependability is greatly enhanced by the preprocessed queries, which verify that the incoming data is formatted consistently and relevant for the classification procedure that follows. The intent patterns (e.g., greetings such as hi, hello, hey) were collected and preprocessed by converting text to lowercase, removing stop words, and applying tokenization using the NLTK toolkit³³. Table 2 shows the preprocessing steps followed by this study.
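The preprocessing steps described above (lowercasing, tokenization, and stop-word removal) can be sketched as follows. This is only a minimal illustration that uses the Python standard library; the study itself reports using NLTK. The stop-word list, the regular-expression tokenizer, and the function name `preprocess` are assumptions introduced here for illustration and are not taken from the study.

```python
import re

# Hypothetical, deliberately tiny stop-word list (assumption for illustration);
# NLTK's English stop-word list is considerably larger.
STOP_WORDS = {"a", "an", "the", "is", "are", "do", "you", "what", "your", "i", "to", "can"}


def preprocess(query: str) -> list[str]:
    """Lowercase a query, split it into word tokens, and drop stop words."""
    tokens = re.findall(r"[a-z0-9']+", query.lower())
    return [tok for tok in tokens if tok not in STOP_WORDS]


print(preprocess("What are your opening hours?"))  # ['opening', 'hours']
```

A production pipeline would more plausibly rely on NLTK's tokenizer and stop-word corpus, plus a lemmatizer; the regular expression here is just a stand-in that keeps the example self-contained.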

LDA-based tag modeling

The system classifies the tags from the user using Support Vector Regression (SVR) so that the user's greetings are recognized by the system. To improve the system's capacity to anticipate user intent, we examined both conventional and state-of-the-art text processing methods in addition to Machine Learning (ML) and Deep Learning (DL) models. Our aim was to build a highly accurate algorithm that could comprehend a broad range of user questions. This study employed the fundamental strategies of Bag of Words (BoW) and TF-IDF because of their ease of use and their effectiveness in emphasizing word frequency and the importance of words in the text. GloVe and Word2Vec were also used because they produce word embeddings from word co-occurrence, which capture word meaning and context²².
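To make the Bag of Words and TF-IDF representations mentioned above concrete, the following sketch computes both for a toy corpus. It is an illustration only: the unsmoothed weighting `tf * log(N / df)` is one common TF-IDF variant and is an assumption here, since the text does not state which variant was used. The function names and the toy queries are hypothetical, and documents are assumed to be non-empty token lists.

```python
import math
from collections import Counter


def bag_of_words(tokens, vocab):
    """Count how often each vocabulary word occurs in one tokenized query."""
    counts = Counter(tokens)
    return [counts[word] for word in vocab]


def tf_idf(docs):
    """Return (vocab, vectors) using tf = count / len(doc) and idf = log(N / df)."""
    vocab = sorted({word for doc in docs for word in doc})
    n_docs = len(docs)
    doc_freq = {word: sum(1 for doc in docs if word in doc) for word in vocab}
    idf = {word: math.log(n_docs / doc_freq[word]) for word in vocab}
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append([counts[word] / len(doc) * idf[word] for word in vocab])
    return vocab, vectors


# Toy tokenized queries (hypothetical).
docs = [["menu", "today"], ["menu", "delivery"]]
vocab, vectors = tf_idf(docs)
print(vocab)       # ['delivery', 'menu', 'today']
print(vectors[0])  # "menu" occurs in every query, so its weight is 0.0
```

The example shows why TF-IDF is useful for intent classification: a word that appears in every query carries no weight, while a word specific to one query ("today") receives a positive weight.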

Once the data is prepared, the intent queries are classified using the Latent Dirichlet Allocation (LDA) method. The goal is to use text mining with LDA to analyze the connections among terms and identify patterns in their structures³⁴,³⁵. LDA is unsupervised and probabilistic, and it assumes that every document in a corpus is composed of a predetermined, manually specified number of topics. Every document in LDA is treated as a bag of words and carries equal weight, and the words in each document are assumed to be unordered. A topic is described by a probability mass function over words, and every document selects its topics according to a probability mass function over topics. For intent classification, LDA was applied using the Gensim Python library³⁴.

In LDA, the intents are viewed as a distribution over latent topics governed by a Dirichlet prior $\alpha = \{\alpha_1, \alpha_2, \ldots, \alpha_c\}$. A pattern is selected based on the intent distribution $\theta$ (multinomial), which gives the probability that a given intent $I$ belongs to a given class $C$. The parameter $\beta$, which is associated with a Dirichlet distribution, encodes each pattern as a bag of words. Given $\alpha$ and $\beta$, the joint distribution over the intent mixture $\theta$, the intent assignments $z$, and the group of $N$ terms $W$ in one of the $M$ patterns is given as³⁶:

$$p(\theta, z, W \mid \alpha, \beta) = p(\theta \mid \alpha) \prod_{n=1}^{N} p(z_n \mid \theta)\, p(w_n \mid z_n, \beta)$$ (1)

Integrating over $\theta$ and summing over $z$ gives the marginal probability of an individual pattern. Taking the product of these marginals over all $M$ patterns, with $N_d$ terms in pattern $d$, gives the probability of the entire set of intent patterns $D$:

$$p(D \mid \alpha, \beta) = \prod_{d=1}^{M} \int p(\theta_d \mid \alpha) \left( \prod_{n=1}^{N_d} \sum_{z_{dn}} p(z_{dn} \mid \theta_d)\, p(w_{dn} \mid z_{dn}, \beta) \right) d\theta_d$$ (2)

The intent patterns for greetings include variations such as hi, hello, hey, good morning/noon/evening, and hola. On receiving one of these patterns, the system identifies the user intent as a greeting and responds with a predefined phrase, such as "What can I help you with?" or "How can I help you?" If the query does not match any of the 15 predefined classes, the system provides restaurant information and suggests that the user contact customer service. The outcome of the LDA-based tag modeling of user queries related to greetings is shown in Figure 2. Responses are produced in the same way for all 15 classes of user intents (queries). The LDA model uses the following parameters: the number of topics (k) is 40, alpha is 0.05, beta is 0.04, and the number of iterations is 100.
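To illustrate how the alpha and beta hyperparameters and the iteration count enter LDA inference, the following sketch implements a small collapsed Gibbs sampler in plain Python. It is not the Gensim implementation used in the study (Gensim's `LdaModel` is based on online variational Bayes); collapsed Gibbs sampling is swapped in here only because it fits in a few lines. The function name `lda_gibbs`, the toy corpus, and the choice of two topics for that corpus are assumptions for illustration; alpha, beta, and the number of iterations use the values reported above.

```python
import random


def lda_gibbs(docs, num_topics, alpha, beta, iterations, seed=0):
    """Collapsed Gibbs sampling for LDA; returns per-document topic mixtures."""
    rng = random.Random(seed)
    vocab = sorted({word for doc in docs for word in doc})
    word_id = {word: i for i, word in enumerate(vocab)}
    vocab_size = len(vocab)
    doc_topic = [[0] * num_topics for _ in docs]                # topic counts per document
    topic_word = [[0] * vocab_size for _ in range(num_topics)]  # word counts per topic
    topic_total = [0] * num_topics                              # total words per topic
    assignments = []
    # Random initial topic assignment for every word occurrence.
    for d, doc in enumerate(docs):
        z_d = []
        for word in doc:
            k = rng.randrange(num_topics)
            z_d.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[word]] += 1
            topic_total[k] += 1
        assignments.append(z_d)
    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for n, word in enumerate(doc):
                k, v = assignments[d][n], word_id[word]
                # Remove the current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                # Conditional probability of each topic (up to a constant).
                weights = [
                    (doc_topic[d][j] + alpha)
                    * (topic_word[j][v] + beta)
                    / (topic_total[j] + vocab_size * beta)
                    for j in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][n] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1
    # Smoothed topic mixture (theta) for each document.
    return [
        [(count + alpha) / (len(doc) + num_topics * alpha) for count in doc_topic[d]]
        for d, doc in enumerate(docs)
    ]


# Toy tokenized queries (hypothetical); the study reports k = 40 on its own data.
docs = [["hi", "hello", "hey"], ["menu", "today", "menu"], ["hello", "hi"], ["today", "menu"]]
theta = lda_gibbs(docs, num_topics=2, alpha=0.05, beta=0.04, iterations=100)
for row in theta:
    print([round(p, 3) for p in row])
```

Small values of alpha and beta, such as those reported, favor sparse mixtures: each query tends to be dominated by one topic, and each topic by a small set of words.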

Food suggestion using Bi-NN for ICS

In the Intelligent Catering System (ICS), food items are systematically categorized to support accurate recommendations. Each menu entry may represent either a single item (e.g., burger, snack, drink) or a combination of items such as a meal set (e.g., fried chicken with a cold drink). These items are grouped into six main categories: chicken, burger, snack, drink, suit, and tiffin. For clarity, suit refers to packaged meal sets, while tiffin represents traditional multi-item meals. Each food item is defined by three key features: its price, its category, and a content vector describing its composition. This structured categorization enables the recommendation model to analyze customer purchase histories at both the item and category levels, improving the system's ability to suggest relevant meals and combos that align with user preferences. All the foods in the ICS form a set denoted as $F = \{f_1, f_2, \ldots, f_N\}$, where $N$ denotes the total number of food items in the ICS. Every food product $f_i$ in $F$ consists of the details of the features of the food, and it is denoted as,

$$f_i = \{f_i^P, f_i^C, f_i^M, f_i^v\}$$ (3)

where $f_i^P$ denotes the price of the food; $f_i^C$ denotes the category of the food item, taken from the set of categories $C$; $f_i^M$ denotes the menu number, with $f_i^M = \{1, 2, 3, \ldots, M-1\}$, where $M$ denotes the menus; and $f_i^v$ denotes the content vector of the food, with $f_i^v = \{C_1^v, C_2^v, \ldots, C_k^v\}$, where $C_k^v$ is an element of the content vector and the elements represent chicken, burger, snack, drink, suit, and tiffin, respectively.

The user data consists of details about the user, which describe the user's features and can be obtained from the app or the usage flow. For every user $U_i$, the features are denoted as,

$$U_i = \{u_i^d, u_i^c, u_i^f\}$$ (4)

where $u_i^d$ denotes the details about the user, such as age and gender. $u_i^c$ denotes the click events of the user, a variable-length vector $u_i^c = \{c_1^i, \ldots, c_j^i, \ldots, c_t^i\}$, where $t$ is the time of the most recent click event and each $c_j^i$ is a positive number; $u_i^c$ thus represents the entire flow sequence of a purchase carried out by the user. $u_i^f$ denotes the list of food purchased by the customer, where $u_i^f \in F = \{f_1^i, \ldots, f_k^i\}$ and $k$ denotes the total number of food items purchased by the user.
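The food and user records defined in Equations (3) and (4) can be sketched as simple data structures. This is only an illustrative mapping: the class names, field names, and example values are hypothetical, and encoding the content vector as per-category item counts is an assumption, since the text does not specify how its elements are valued.

```python
from dataclasses import dataclass, field

# The six categories named in the text, in the order used for the content vector.
CATEGORIES = ("chicken", "burger", "snack", "drink", "suit", "tiffin")


@dataclass
class FoodItem:
    price: float        # price feature of the item
    category: str       # one of CATEGORIES
    menu_id: int        # menu number
    content: list[int]  # content vector: assumed item count per category


@dataclass
class User:
    details: dict                                            # e.g. age and gender
    clicks: list[int] = field(default_factory=list)          # variable-length click sequence
    purchases: list[FoodItem] = field(default_factory=list)  # purchased food items


# Hypothetical meal set: fried chicken with a cold drink.
combo = FoodItem(price=6.5, category="suit", menu_id=3, content=[1, 0, 0, 1, 0, 0])
user = User(details={"age": 30, "gender": "F"}, clicks=[3, 7, 3], purchases=[combo])
print(user.purchases[0].category)  # suit
```

Keeping the click sequence as a variable-length list mirrors the description of the click-event vector and leaves padding or truncation to the model's input stage.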

With the use of F and U, which denote food items and user data, respectively, the recommendation system is framed in ICS. The objective of this system is to improve the accuracy to enhance the user purchase intention. The problem of ICS is formulated as,

$$R \approx \{ICS(L^*)\}$$ (5)

Equation (5) states the objective of the problem: to obtain improved accuracy on the ICS problem by minimizing the loss function, which is given in Equation (6).

$$L^* = \arg\min_{L} \; \mathrm{Loss}\big(f(X_i, L),\, Y_i\big)$$ (6)

where $f(X_i, L)$ is the predicted result and $Y_i$ is the actual result. The loss function measures the discrepancy between the model's predicted value and the true value $Y$; the smaller the value of the loss function, the better the model performs. In this,

$$X_i = \{x_0, \ldots, x_n\}, \quad Y_i = \{y_0, \ldots, y_n\}$$ (7)

where,

[Equation not recoverable from the extracted text] (8)

[Equation not recoverable from the extracted text] (9)

In this, r denotes the training data in X, and s is the purchased food item number of one training data. The developed ICS employed cross-entropy as a loss function, which is described in the following section.
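Since the ICS uses cross-entropy as its loss, a minimal sketch of categorical cross-entropy over softmax outputs may help make the quantity concrete. The function names, the example scores, and the clipping constant `eps` are assumptions for illustration, not details taken from the study.

```python
import math


def softmax(logits):
    """Convert raw scores into a probability distribution (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def categorical_cross_entropy(y_true, y_pred, eps=1e-12):
    """Return -sum(y_true * log(y_pred)) for a one-hot y_true; eps avoids log(0)."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))


# Hypothetical scores for three candidate food items; the first was purchased.
probs = softmax([2.0, 1.0, 0.1])
loss = categorical_cross_entropy([1, 0, 0], probs)
print(round(loss, 4))
```

The loss is zero only when the model assigns probability 1 to the purchased item, and it grows as that probability shrinks, which matches the statement that a smaller loss indicates a better model.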

Conv-RNN-based food recommendation

By analyzing item attributes, user ratings, and user profiles, a Conv-RNN-based recommendation system provides users with recommendations based on their interests. Figure 3 shows the proposed Conv-RNN design. Conv-RNN models typically take into account, or automatically add, information about the user's temporal context when making suggestions, and the effectiveness of a recommender system often depends on how well it understands and uses the context of each recommendation request. Conv-RNN calculates predicted ratings from the dynamic features and attributes of the item and the user's current time context to provide suitable recommendations for a specific user. Users in similar circumstances at the same time are likely to have similar preferences, so the effectiveness of a CNN-based time-aware recommender depends on its capacity to find the users who are most similar to the target user and share the same temporal context. The CNN therefore captures the temporal context, that is, the time-sensitive information about the user's activity. The user attributes, item characteristics, and time information are fed to the CNN's input layer to rebuild the original matrix, the convolution layer extracts features from this matrix, and the final output is then computed from those features.

The food click-event features are extracted by the convolution layers, whose output size is given by Eqn (10):

$$O = \frac{X + 2a - F}{S} + 1$$ (10)

where $O$ is the output size, $X$ is the input data size, $F$ is the convolutional kernel size, $a$ is the padding added to the input data, and $S \geq 1$ is the kernel stride. In a neural network, layer-to-layer communication is strictly sequential. Because the activation function in the activation layer performs nonlinear operations, the network can model more complex relationships than it could if it were restricted to linear operations between neighboring layers. Activation functions are used throughout the Conv-RNN. Conventional activation functions such as Tanh and sigmoid suffer from vanishing gradients and are effective only over a small input range. The rectified linear unit (ReLU), a computationally inexpensive nonlinear operation, is the primary tool for overcoming these two issues.
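As a quick check of the convolution output-size relation, the following sketch evaluates the standard formula O = (X + 2a − F)/S + 1 for a few settings. The function name and the example sizes are hypothetical, and the use of floor division when the stride does not divide evenly is an assumption that follows common deep learning framework conventions.

```python
def conv_output_size(x: int, f: int, a: int, s: int) -> int:
    """Output size for input size x, kernel size f, padding a, and stride s."""
    if s < 1:
        raise ValueError("stride must be at least 1")
    return (x + 2 * a - f) // s + 1


print(conv_output_size(32, 3, 1, 1))  # 32: padding of 1 preserves the size for a 3-wide kernel
print(conv_output_size(28, 5, 0, 1))  # 24: no padding shrinks the output
print(conv_output_size(7, 3, 0, 2))   # 3: a stride of 2 roughly halves the output
```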

For computational efficiency, the pooling layer down-samples the feature data and makes it sparser. Maximum pooling and average pooling are the two best-known pooling methods; MaxPooling provides better feature selection results. MaxPooling selects features, with the output size given by:

$O = \frac{X - F}{S} + 1$ (11)
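To make the pooling step concrete, the sketch below applies one-dimensional max pooling to a short list of feature values. It assumes a window of size F moved with stride S and no padding; the function name and the sample values are hypothetical.

```python
from typing import List


def max_pool_1d(values: List[float], f: int, s: int) -> List[float]:
    """Slide a window of size f with stride s and keep the maximum of each window."""
    if f < 1 or s < 1:
        raise ValueError("window size and stride must be at least 1")
    return [max(values[i:i + f]) for i in range(0, len(values) - f + 1, s)]


features = [0.1, 0.7, 0.3, 0.9, 0.2, 0.4]  # hypothetical feature values
pooled = max_pool_1d(features, f=2, s=2)
print(pooled)  # [0.7, 0.9, 0.4]
```

Six inputs pooled with a window of 2 and a stride of 2 give three outputs, each the strongest response in its window.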

The Conv-RNN then applies the fully connected stage, which uses two dense layers to retrain the Conv-RNN tail with less loss of feature information. Recurrent layers are commonly used in neural networks for the analysis of sequential data. Unlike traditional feedforward layers, which handle each input independently, recurrent layers have connections that allow them to maintain an internal memory of previous inputs. This makes recurrent layers particularly well suited to tasks involving sequences, time-series data, or any kind of information where the order of inputs matters. The fundamental unit of a recurrent layer is the recurrent neuron, often known as an RNN cell. RNN cells handle inputs one at a time and maintain an internal state that contains information from previous inputs. This internal state is updated at each time step, which influences the processing of subsequent inputs. The output layer presents the results to the user after applying the SoftMax classifier. Categorical fields, whose values serve as indices into the embedding matrix, must first be converted to integers before they are used.
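The role of the internal state can be illustrated with a single scalar RNN cell. This is a minimal sketch assuming an Elman-style update with a tanh activation; the weights and input sequences are hypothetical and do not correspond to the trained model.

```python
import math
from typing import List


def rnn_step(x: float, h_prev: float, w_x: float, w_h: float, b: float) -> float:
    """One Elman-style update: h_t = tanh(w_x * x_t + w_h * h_{t-1} + b)."""
    return math.tanh(w_x * x + w_h * h_prev + b)


def run_rnn(sequence: List[float], w_x: float = 0.5, w_h: float = 0.8,
            b: float = 0.0) -> List[float]:
    """Process inputs one at a time, carrying the hidden state forward."""
    h = 0.0
    states = []
    for x in sequence:
        h = rnn_step(x, h, w_x, w_h, b)
        states.append(h)
    return states


# Same values in a different order lead to different final states.
early_signal = run_rnn([1.0, 0.0, 0.0])
late_signal = run_rnn([0.0, 0.0, 1.0])
print(early_signal[-1], late_signal[-1])
```

Because the state at each step depends on the previous state, the two sequences end in different hidden states even though they contain the same values, which is the order sensitivity described above.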

The Conv-RNN recommendation model was trained with the categorical cross-entropy loss function, which quantifies the difference between the true class labels y and the predicted probability distribution y'. The model parameters θ (weights and biases) were optimized using stochastic gradient descent (SGD) with adaptive moment estimation (Adam). At each training iteration t, the parameters were updated as follows:

$\theta_{t+1} = \theta_t - \eta \nabla_\theta L(\theta_t)$ (12)

where η is the learning rate and $\nabla_\theta L$ is the gradient of the loss function with respect to θ.
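The update rule in Equation (12) can be traced on a toy problem. The sketch below treats θ as a vector of three logits, uses categorical cross-entropy as the loss, and applies one plain gradient step. It deliberately omits the moment estimates that Adam adds, and all values other than the learning rate of 0.002 are hypothetical.

```python
import math
from typing import List


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(y_true: List[float], y_pred: List[float]) -> float:
    """Categorical cross-entropy between one-hot labels and predicted probabilities."""
    eps = 1e-12  # guards against log(0)
    return -sum(t * math.log(p + eps) for t, p in zip(y_true, y_pred))


def gradient_step(theta: List[float], y_true: List[float], eta: float) -> List[float]:
    """theta_{t+1} = theta_t - eta * grad, where grad = softmax(theta) - y."""
    probs = softmax(theta)
    grad = [p - t for p, t in zip(probs, y_true)]
    return [w - eta * g for w, g in zip(theta, grad)]


theta = [0.0, 0.0, 0.0]   # hypothetical starting logits
y = [0.0, 1.0, 0.0]       # true class is index 1
loss_before = cross_entropy(y, softmax(theta))
theta = gradient_step(theta, y, eta=0.002)
loss_after = cross_entropy(y, softmax(theta))
print(loss_before, loss_after)
```

For a softmax output with cross-entropy loss, the gradient with respect to the logits is simply the predicted distribution minus the one-hot label, which is why the step raises the logit of the true class and lowers the others.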

The following measures were used to avoid overfitting. During training, dropout regularization (ρ = 0.3) was applied to the fully connected layers to randomly deactivate neurons. Training was stopped early, based on the validation loss, when no improvement was seen for 15 consecutive epochs. Five-fold cross-validation was used to confirm generalization performance.
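Two of these measures, early stopping and five-fold cross-validation, can be sketched without a deep learning framework. The helper names, the synthetic validation losses, and the contiguous (unshuffled) fold layout below are assumptions made for illustration; only the patience of 15 epochs and the use of five folds come from the text.

```python
from typing import List, Tuple


def early_stopping_epoch(val_losses: List[float], patience: int = 15) -> int:
    """Return the 1-based epoch at which training stops."""
    best = float("inf")
    waited = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best, waited = loss, 0
        else:
            waited += 1
            if waited >= patience:
                return epoch
    return len(val_losses)


def k_fold_indices(n_samples: int, k: int = 5) -> List[Tuple[List[int], List[int]]]:
    """Split range(n_samples) into k contiguous folds; each fold validates once."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        val = list(range(start, start + size))
        train = [i for i in range(n_samples) if i < start or i >= start + size]
        folds.append((train, val))
        start += size
    return folds


# Synthetic losses: improvement for three epochs, then a plateau.
losses = [1.0, 0.8, 0.7] + [0.75] * 20
stop_epoch = early_stopping_epoch(losses)
folds = k_fold_indices(10)
print(stop_epoch, [val for _, val in folds])
```

In this synthetic run the best loss occurs at epoch 3, so training stops 15 epochs later, at epoch 18. In practice the data would normally be shuffled before the folds are formed.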

Hyperparameter tuning was carried out using a grid search on the validation set. The search space covered the number of filters for the convolution layer (64), the kernel size (3 × 3), the number of recurrent layers (100), the batch size, the dropout rate (0.2), and the learning rate (0.002). The optimizer was Adam. Finally, the output layer returns the recommended foods as a one-hot encoded vector, in which suggested foods are denoted by 1 and all other outputs by 0.
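A grid search of this kind can be outlined as follows. This is only a sketch: the candidate values other than 64 filters, a dropout rate of 0.2, and a learning rate of 0.002 are hypothetical, and `toy_validation_score` is a stand-in for training the model and measuring its validation performance, not a result from the study.

```python
from itertools import product
from typing import Callable, Dict, List, Optional, Tuple

Params = Dict[str, float]


def grid_search(space: Dict[str, List[float]],
                score: Callable[[Params], float]) -> Tuple[Optional[Params], float]:
    """Evaluate every combination in the space and return the highest-scoring one."""
    names = list(space)
    best_params, best_score = None, float("-inf")
    for values in product(*(space[name] for name in names)):
        params = dict(zip(names, values))
        current = score(params)
        if current > best_score:
            best_params, best_score = params, current
    return best_params, best_score


# Hypothetical candidate grid around the values reported in the text.
search_space = {
    "filters": [32, 64, 128],
    "dropout": [0.2, 0.3, 0.5],
    "learning_rate": [0.01, 0.002, 0.001],
}


def toy_validation_score(params: Params) -> float:
    """Placeholder for 'train, then validate'; constructed to peak at the reported values."""
    return -(abs(params["filters"] - 64) / 64
             + abs(params["dropout"] - 0.2)
             + abs(params["learning_rate"] - 0.002) * 100)


best, best_score = grid_search(search_space, toy_validation_score)
print(best)
```

The search is exhaustive, so its cost grows with the product of the candidate list lengths (27 evaluations here).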

Customer satisfaction prediction using Conv-LSTM

This study used the Convolutional Long Short-Term Memory (Conv-LSTM) model to forecast customer satisfaction once customers have finished their catering service. The structure of Conv-LSTM is shown in Figure 4. The CNN architecture includes input neurons, a series of convolutional layers, pooling layers, fully connected layers, and normalization layers³⁷. The neurons of a convolution layer are connected to the layer above only through a narrow local region, whereas the neurons of the fully connected layers are fully connected to the layers below them. The Conv-LSTM inputs explicitly define the tensor shapes and temporal granularity. Each input sequence is structured per customer order session (time step = per order), where the purchase list is encoded as a multi-hot vector and the associated satisfaction level is represented as a numerical score. The resulting tensor has the shape (batch size × sequence length × feature dimension).
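The input layout described here can be sketched with nested lists. The menu, the orders, and the scores below are hypothetical, and appending the satisfaction score as the last feature of each time step is an assumption; the text does not state whether the score is an input feature or only the prediction target.

```python
from typing import List

MENU = ["soup", "salad", "noodles", "rice", "tea"]  # hypothetical menu


def multi_hot(order: List[str], menu: List[str]) -> List[float]:
    """Encode a purchase list as a 0/1 vector over the menu."""
    index = {item: i for i, item in enumerate(menu)}
    vector = [0.0] * len(menu)
    for item in order:
        vector[index[item]] = 1.0
    return vector


def encode_session(orders: List[List[str]], scores: List[float],
                   menu: List[str]) -> List[List[float]]:
    """One time step per order: multi-hot purchase list followed by its score."""
    return [multi_hot(order, menu) + [score] for order, score in zip(orders, scores)]


batch = [
    encode_session([["soup", "rice"], ["tea"]], [4.0, 5.0], MENU),
    encode_session([["salad"], ["noodles", "tea"]], [3.0, 4.5], MENU),
]
shape = (len(batch), len(batch[0]), len(batch[0][0]))
print(shape)  # (2, 2, 6)
```

The printed shape follows the (batch size × sequence length × feature dimension) convention: two sessions, two orders per session, and six features per order. Sessions of unequal length would need padding or truncation to a common sequence length.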

Forward and backward propagation in a CNN generally group the parameters according to their input. Numerous CNN designs have emerged as a result of recent research advancements. As shown in Equation (13), each LSTM block employs three parameters, iw, rw, and b, which denote the input weight, recurrent weight, and bias, respectively.

[Equation image: LSTM block computation in terms of iw, rw, and b] (13)

The following is a declaration of the cell state at time step t:

[Equation image: LSTM cell state at time step t] (14)

where ⊙ denotes the Hadamard product. The hidden state $H_t$ at time step t is given by:

$H_t = O_t \odot \tanh(C_t)$ (15)
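For orientation, the sketch below implements one step of a standard LSTM cell with scalar inputs and states. It assumes the textbook gate formulation (input, forget, and output gates plus a candidate value), which may differ in notation or detail from the equations used in the study. The names iw, rw, and b follow the text; the gate keys and the numeric values are hypothetical. With scalars, the Hadamard product reduces to ordinary multiplication.

```python
import math
from typing import Dict, Tuple


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x: float, h_prev: float, c_prev: float,
              iw: Dict[str, float], rw: Dict[str, float],
              b: Dict[str, float]) -> Tuple[float, float]:
    """One standard LSTM step; dictionaries are keyed by gate: 'i', 'f', 'o', 'g'."""
    i = sigmoid(iw["i"] * x + rw["i"] * h_prev + b["i"])    # input gate
    f = sigmoid(iw["f"] * x + rw["f"] * h_prev + b["f"])    # forget gate
    o = sigmoid(iw["o"] * x + rw["o"] * h_prev + b["o"])    # output gate
    g = math.tanh(iw["g"] * x + rw["g"] * h_prev + b["g"])  # candidate value
    c = f * c_prev + i * g    # cell state update
    h = o * math.tanh(c)      # hidden state: H_t = O_t * tanh(C_t)
    return h, c


# With all parameters at zero, every gate equals 0.5 and the candidate is 0.
zeros = {gate: 0.0 for gate in "ifog"}
h_t, c_t = lstm_step(x=1.0, h_prev=0.0, c_prev=2.0, iw=zeros, rw=zeros, b=zeros)
print(h_t, c_t)
```

In the zero-parameter case the cell state is simply halved by the forget gate, which makes the last line of the function easy to verify by hand against Equation (15).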

The hyperparameters and their values of Conv-LSTM are declared as follows: The number of filters for the convolution layer is 64, the kernel size is 3 x 3, the LSTM units are 100 with a dropout rate of 0.2, the batch size is 64, the learning rate is 0.002, and the number of time steps is 50 with the Adam optimizer.

UML diagram and Pseudo Code for Customer Interaction

The dynamic flow of interactions in the suggested system is depicted in the UML sequence diagram (Figure 5). The NLP–LDA module processes the user's request (such as a meal order or query) for topic modeling and intent extraction. Following processing, the recommendation engine (Conv-RNN) receives the request and produces a customized recommendation. Lastly, the user receives a real-time response from the system. This sequence guarantees transparency in the conversion of user input into intelligent service outputs and emphasizes the modular interplay of components.

Pseudocode for the Conv-RNN recommendation algorithm is provided to improve reproducibility. It gives an overview of the sequential computational logic: preprocessing the user request, applying convolutional and recurrent layers for feature extraction and sequence modeling, applying regularization, and using a softmax output layer to generate a suggestion. This pseudocode offers a clear implementation-level view of the model workflow that complements the mathematical formulations.

Pseudo Code: Conv-RNN Recommendation Algorithm

Input: User request U, historical interaction sequence H

Output: Recommended food item R

Preprocess U → tokenize, embed with Word2Vec
Construct input sequence S = [H, U]
Apply Convolutional Layer:
    S_conv = Conv1D(S, filters, kernel_size)
Pass through Recurrent Layer (RNN/GRU):
    S_rnn = RNN(S_conv, hidden_units)
Apply Dropout for regularization
Dense layer with Softmax activation:
    P = Softmax(W · S_rnn + b)
Select recommendation:
    R = argmax(P)
Return R to the user
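The last three steps of the pseudocode (dense layer, softmax, and argmax) can be written out in plain Python. This is a minimal sketch with a hypothetical menu, weight matrix, bias vector, and two-dimensional feature vector standing in for S_rnn; it is not the trained model.

```python
import math
from typing import List, Tuple


def dense_softmax(features: List[float], weights: List[List[float]],
                  bias: List[float]) -> List[float]:
    """P = Softmax(W · features + b), with one weight row per menu item."""
    logits = [sum(w * x for w, x in zip(row, features)) + b_k
              for row, b_k in zip(weights, bias)]
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def recommend(features: List[float], weights: List[List[float]],
              bias: List[float], menu: List[str]) -> Tuple[str, List[int]]:
    """R = argmax(P); also return the one-hot vector marking the suggestion."""
    probs = dense_softmax(features, weights, bias)
    best = max(range(len(probs)), key=probs.__getitem__)
    one_hot = [1 if k == best else 0 for k in range(len(probs))]
    return menu[best], one_hot


MENU = ["soup", "noodles", "rice"]            # hypothetical menu
W = [[0.2, -0.1], [0.9, 0.4], [-0.3, 0.5]]    # hypothetical dense weights
B = [0.0, 0.1, 0.0]                           # hypothetical biases
item, one_hot = recommend([1.0, 0.5], W, B, MENU)
print(item, one_hot)  # noodles [0, 1, 0]
```

The one-hot vector corresponds to the output encoding described for the recommendation model, with 1 marking the suggested food and 0 marking all others.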


Results


This study thoroughly tested and validated several models to ensure the validity and dependability of the developed ICS. The most efficient setup for ICS was determined by performing a comparative study of several word embedding and classifier combinations. Each experiment was conducted 10 times, and the results are presented as average values with standard errors enclosed in parentheses. This method highlights the variability and consistency of the model's performance. The s...


Discussion


The overall performance of the proposed AI-based ICS model is compared with k-means with SVR²⁴, quick service restaurant with LSTM (QSR-LSTM)²⁵, and NLP-ANN³⁸. The proposed model achieved a lower computation time than the compared approaches, as shown in Figure 12. As the number of iterations increases, the computation time for all the models increases gradually. The suggest...


Disclosures


The authors have no conflicts of interest.

Acknowledgements


The authors gratefully acknowledge the research support provided by the Faculty of Information Science and Technology, The National University of Malaysia. This work was made possible through the university's internal research funding and academic support infrastructure. The authors also extend their appreciation to colleagues and technical staff for their valuable input during the system design and modeling phase.


Materials

List of materials used in this article
Name	Tool / Description	URL	Version / Comments
Programming Language	Python (used for model development, NLP, and deep learning)	https://www.python.org/	Python 3.8+
Database	MySQL or SQLite (for storing user interaction logs)	https://www.mysql.com/; https://www.sqlite.org/	MySQL 8.0 or SQLite3
Dataset	User queries collected from local restaurant ordering chatbot		Manually annotated
Deep Learning Framework	TensorFlow / Keras	https://www.tensorflow.org/; Keras 2.11 → https://keras.io/	TensorFlow 2.11 or Keras 2.11
Development Environment	Jupyter Notebook / Google Colab	https://jupyter.org/; https://colab.research.google.com/	JupyterLab 3+ / Colab (free)
Evaluation Metrics	scikit-learn metrics: precision, recall, cross-entropy, R²	https://scikit-learn.org/	scikit-learn 1.0+
Natural Language Toolkit	spaCy / NLTK (for intent detection preprocessing)	https://spacy.io/; https://www.nltk.org/	spaCy 3.0 / NLTK 3.6
Recurrent Neural Network Models	RNN, LSTM, Conv-LSTM	https://keras.io/	Implemented in Keras
System Hardware	Intel Core i7, 16GB RAM, NVIDIA GTX 1660 Ti GPU		Local system
Topic Modeling Tool	Gensim (used for Latent Dirichlet Allocation)	https://radimrehurek.com/gensim/	Gensim 4.1.2
Visualization Tools	Matplotlib, Seaborn (for plotting performance graphs)	https://seaborn.pydata.org/; https://matplotlib.org/	Matplotlib 3.5+, Seaborn 0.11
Word Embedding	Word2Vec / GloVe pre-trained embeddings	https://nlp.stanford.edu/projects/glove/	GloVe (100D), Stanford NLP

References


Limna, P., Siripipatthanakul, S., Phayaphrom, B. The role of big data analytics in influencing artificial intelligence (AI) adoption for coffee shops in Krabi, Thailand. Int J Behav Anal. 1, 1-17 (2021).
Sharma, A., Mittal, K., Kumar, S., Sharma, U., Upadhyay, P. Impact of artificial intelligence and machine learning in the food industry: a survey. Artificial Intell Appl Agri Food Quality Improvement. , (2022).
Wikhamn, W. Innovation, sustainable HRM, and customer satisfaction. Int J Hosp Manag. 76, 102-110 (2019).
Goel, P., Kaushik, N., Sivathanu, B., Pillai, R., Vikas, J. Consumers' adoption of artificial intelligence and robotics in hospitality and tourism sector: literature review and future research agenda. Tour Rev. 77, 1-16 (2022).
Singh, V., Archana, T., Singh, A., Tyagi, P. K. Utilizing technology for food waste management in the hospitality industry hotels and restaurants. Sustainable Disposal Methods of Food Wastes in Hospitality Operations. IGI Global. , 287-295 (2024).
Murugeah, M. K. Enhancing efficiency and personalization in food and beverage service through AI: future trends and challenges. Int J Multidimens Res Perspect. 2 (7), (2024).
Nebolisa, C., Abiagom, T., Ijomah, I. Enhancing customer experience through AI-driven language processing in service interactions. Open Access Res J Eng Technol. 7 (1), 14-21 (2024).
Buhalis, D. Technology and service excellence in hospitality: the future is now. J Hosp Mark Manag. 29, 1-9 (2020).
Go, H., Kang, M., Suh, S. C. Machine learning of robots in tourism and hospitality: interactive technology acceptance model (iTAM) - cutting edge. Tour Rev. 75, 625-636 (2020).
Ivanov, S., Webster, C. Robots in tourism: a research agenda for tourism economics. Tour Econ. 26, 1065-1085 (2020).
Paluch, S., Wirtz, J. Artificial intelligence and robots in the service encounter. J Serv Manag Res. 4, 3-21 (2020).
Lukanova, G., Ilieva, G. Robots, artificial intelligence, and service automation in hotels. Robots, Artificial Intelligence, and Service Automation in Travel, Tourism, and Hospitality. Ivanov, S., Webster, C. , Emerald Publishing. 157-183 (2019).
Wen, H., Lee, Y. M. Effects of message framing on food allergy communication: a cross-sectional study of restaurant customers with food allergies. Int J Hosp Manag. 89, 102401 (2020).
Resendes, M., et al. Group-level formative feedback and metadiscourse. Int J Comput Support Collab Learn. 10, 309-336 (2015).
Enterprise Apps Today. , https://www.enterpriseappstoday.com (2023).
Naseem, S. The role of tourism in economic growth: empirical evidence from Saudi Arabia. Economics. 9, 2-12 (2021).
Proenca, S., Soukiazis, E. Tourism as an economic growth factor: a case study for Southern European countries. Tour Econ. 14, 791-806 (2008).
Kreishan, F. M. Empirical study of tourism and economic growth of Bahrain: an ARDL bounds testing approach. Int J Econ Finance. 7, 1-9 (2011).
Kim, H., Yang, S. The role of service robots in hospitality: insights from customers, employees, and managers. J Hosp Tour Technol. 12, 539-559 (2021).
Zhe, L., et al. Artificial intelligence in food safety: a decade review and bibliometric analysis. Foods. 12, 1242 (2023).
Huiyue, Y., et al. A review of robotic applications in hospitality and tourism research. Sustainability. 14, 10827 (2022).
Roumeliotis, K. I., Tselikas, N. D., Nasiopoulos, D. K. Leveraging large language models in tourism: a comparative study of the latest GPT Omni models and BERT NLP for customer review classification and sentiment analysis. Information. 15, 792 (2024).
Zhang, Y., Wang, J., Li, S., Chen, X. PURE: A profile updating framework with LLMs for recommender systems. arXiv preprint. , (2025).
Wu, C., Zhao, W. X., Wen, J. R. A survey on large language models for recommender systems. ACM Trans Recomm Syst. , (2024).
Rodrigues, M., Miguéis, V., Freitas, S., Machado, T. Machine learning models for short-term demand forecasting in food catering services: a solution to reduce food waste. J Clean Prod. 435, 140265 (2024).
Kim, D. W., et al. Qualitative evaluation of artificial intelligence-generated weight management diet plans. Front Nutr. 11, 1374834 (2024).
Nejadshamsi, S., Eicker, U., Wang, C., Bentahar, J. Data sources and approaches for building occupancy profiles at the urban scale - a review. Build Environ. 238, 110375 (2023).
Nejadshamsi, S., Bentahar, J., Eicker, U., Wang, C., Jamshidi, F. A geographic-semantic context-aware urban commuting flow prediction model using graph neural network. Expert Syst Appl. 261, 125534 (2025).
Nejadshamsi, S., Eicker, U., Bentahar, J., Wang, C. Improving urban-scale building occupancy and energy use estimation using a transportation-informed building occupancy estimation framework. Energy Build. 333, 115468 (2025).
Zou, J., Sun, A., Long, C., Kanoulas, E. Knowledge-enhanced conversational recommendation via transformer-based sequential modelling. arXiv preprint. , (2024).
Gambetti, A., Han, Q. AiGen-FoodReview: A multimodal dataset of machine-generated restaurant reviews and images on social media. arXiv preprint. , (2024).
Ju, X., et al. MenuAI: Restaurant food recommendation system via a transformer-based deep learning model. arXiv preprint. , (2022).
Bird, S., Klein, E., Loper, E. Natural Language Processing with Python. , O'Reilly Media. (2009).
Zibarzani, M., et al. Customer satisfaction with restaurant service quality during COVID-19 outbreak: a two-stage methodology. Technol Soc. 70, 101977 (2022).
Nilashi, M., et al. Recommendation agents and information sharing through social media for coronavirus outbreak. Telemat Inform. 61, 101597 (2021).
Blei, D. M., Ng, A. Y., Jordan, M. I., et al. Latent Dirichlet allocation. J Mach Learn Res. 3, 993-1022 (2003).
He, K., Zhang, X., Ren, S., Sun, J. Deep residual learning for image recognition. Proc IEEE Conf Comput Vis Pattern Recognit. , Las Vegas Valley, NV, , (2016).
Srivastava, N., Mansimov, E., Salakhutdinov, R. Unsupervised learning of video representations using LSTMs. CoRR. , (2015).
IEEE Standards Association. IEEE 7002-2022: IEEE standard for data privacy process. , IEEE. (2025).
