"ChatGPT in the industry" - Part 3
The intelligent assistant
This part of the article series "ChatGPT in the industry" focuses on how the technology of Large Language Models (LLMs) is finding its way into industrial products and services in the form of intelligent assistants.
How can user interfaces for complex industrial products and services be kept as simple and easy to understand as possible? In this context, wouldn't it be tempting to have an intelligent assistant at hand at all times? Such an assistant would understand questions and instructions regardless of language and then provide precisely the information needed to rectify a fault or to take the next optimal operating step.
This desire for "conversational interfaces", i.e. dialog-oriented artificial intelligence, is nothing new. However, the path from simple text-based programs to complex systems based on natural language processing (NLP) and machine learning (ML) has been technically arduous. Many people will still remember the talking paperclip in Microsoft Office programs, or more recent variants such as Microsoft's Cortana, Apple's voice assistant Siri or Amazon's Alexa. Early versions tended to raise a smile rather than provide real help, but in recent years voice assistants have established themselves as an efficient tool for a variety of applications - from customer service to interactive learning platforms.
But with the release of ChatGPT or gpt-4 from OpenAI, another profound change has taken place in the field of chatbots and conversational interfaces. In addition to a massive leap in quality and flexibility through the use of LLMs for chatbots, the technical implementation is now also much easier. A finely tuned rule system, which used to be the mandatory basis for chatbot behavior, is no longer necessary.
This article focuses specifically on the technical possibilities offered by LLMs, as they open the door wide to dialog-controlled user interfaces for industrial products and services, virtually eliminating the very high barriers to entry that existed previously.
Specific requests with local knowledge
The easiest way would be to build a chatbot directly on one of the large language models (LLMs). For general queries, models such as gpt-4 would probably even provide helpful answers. The disadvantage, however, is that the model cannot know the manufacturer-specific and machine-specific context or the changing operating data. To solve this problem, a technique called Retrieval Augmented Generation (RAG) has become established. The RAG process can be summarized as follows:
- Retrieval: When the language model receives a query, it first searches for relevant information in an external database or dataset. This database can consist of a variety of documents, such as operating manuals in PDF format, or selected customer support websites or other sources of information.
- Augmentation: The retrieved information is then used by the language model to inform or "augment" the response. This means that the model draws not only on the internal knowledge it acquired during training, but also on specific information retrieved from the database.
- Generation: Finally, the model generates a response or text based on both its own training and the specific information retrieved.
The advantage of RAG lies in its ability to provide more up-to-date and more specific information than a model relying solely on its training data could. This is particularly important in the context of industrial products, where very specific information is required. However, the quality and relevance of the answers are highly dependent on the quality of the data retrieved. The answers are only as good as the data in the background: the strength of artificial intelligence cannot compensate for gaps in the databases and information.
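The three RAG steps described above can be sketched in a few lines of Python. This is only a minimal illustration under simplifying assumptions: the documents, the word-overlap ranking and the `echo_llm` stand-in are hypothetical placeholders, whereas a real system would use an embedding-based search and an actual language model.

```python
from typing import Callable, List

# Hypothetical mini knowledge base; a real one would hold manual excerpts.
DOCUMENTS = [
    "Error E12 indicates low hydraulic pressure. Check the pump filter.",
    "The spindle must be lubricated every 500 operating hours.",
    "Error E7 indicates an open safety door. Close the door and reset.",
]


def retrieve(query: str, documents: List[str], k: int = 1) -> List[str]:
    """Retrieval: rank documents by the number of words shared with the query."""
    query_words = set(query.lower().replace("?", "").split())

    def overlap(doc: str) -> int:
        doc_words = set(doc.lower().replace(".", "").split())
        return len(query_words & doc_words)

    return sorted(documents, key=overlap, reverse=True)[:k]


def augment(query: str, passages: List[str]) -> str:
    """Augmentation: place the retrieved passages in front of the question."""
    context = "\n".join(passages)
    return f"Context:\n{context}\n\nQuestion: {query}"


def generate(prompt: str, llm: Callable[[str], str]) -> str:
    """Generation: hand the combined prompt to the language model."""
    return llm(prompt)


def echo_llm(prompt: str) -> str:
    """Stand-in for a real LLM: simply returns the first context line."""
    return prompt.splitlines()[1]


def answer(query: str, llm: Callable[[str], str] = echo_llm) -> str:
    passages = retrieve(query, DOCUMENTS)
    return generate(augment(query, passages), llm)


print(answer("What does error E12 mean?"))
```

Because the model only sees what `retrieve` returns, a missing or outdated document leads directly to a poor answer, which is the dependency on data quality noted above.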
The complete detailed process of a chatbot based on RAG is as follows (Figure 1):
In the preparation phase (highlighted in green), all relevant documents and information are compiled and read in. They are then converted into vectors using an "embedding model", which is usually comparatively simple and small, and stored in a vector database for later access by the chatbot.
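The preparation phase can be pictured roughly as follows. The sketch assumes that documents are split into overlapping word windows before embedding, which is a common approach but is not spelled out in the article. The `embed` function is a toy hashed bag-of-words stand-in for a real embedding model, and a plain Python list stands in for the vector database; all names are illustrative.

```python
import math
import zlib
from dataclasses import dataclass
from typing import Dict, List

DIM = 64  # assumed toy dimension; real embedding models use far more


def embed(text: str) -> List[float]:
    """Toy stand-in for an embedding model: hashed bag of words, L2-normalized."""
    vec = [0.0] * DIM
    for word in text.lower().split():
        vec[zlib.crc32(word.encode("utf-8")) % DIM] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def chunk(text: str, size: int = 8, overlap: int = 2) -> List[str]:
    """Split text into windows of `size` words that overlap by `overlap` words."""
    words = text.split()
    if not words:
        return []
    step = size - overlap
    last_start = max(len(words) - overlap, 1)
    return [" ".join(words[i:i + size]) for i in range(0, last_start, step)]


@dataclass
class Entry:
    source: str          # document the chunk came from
    position: int        # index of the chunk within that document
    text: str            # the chunk itself
    vector: List[float]  # its embedding


def build_index(documents: Dict[str, str]) -> List[Entry]:
    """Chunk and embed every document; the list stands in for a vector database."""
    index = []
    for source, text in documents.items():
        for position, piece in enumerate(chunk(text)):
            index.append(Entry(source, position, piece, embed(piece)))
    return index


manual = " ".join(f"w{i}" for i in range(20))  # placeholder document text
index = build_index({"manual.pdf": manual})
print(len(index), "chunks stored")
```

Keeping the source and position with each vector is what later allows the assistant to point back to the relevant place in the manual.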
The process triggered by a user request is symbolized by the blue arrows in the graphic. Several steps take place here to create the combined prompt. First, the user request is routed through the embedding model and vectorized in the same way as the information read in beforehand. For this "request vector", the entries in the vector database with the greatest similarity are then determined mathematically. In this step, the three to five most similar blocks of information are usually returned from the vector database and, depending on the database and system settings, supplemented with further surrounding context. This further increases the information content of the search results. The final query prompt, consisting of the original user query, the prompt template and the search results, is then compiled and sent to the LLM, which returns the answer for the user. The prompt template is of particular importance as the collection point for all information.
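The similarity search itself can be illustrated with a small sketch. It assumes cosine similarity as the measure, which is a common choice but not one the article specifies, and it interprets "completed with further context" as adding the chunks directly before and after each hit. The three-dimensional vectors are made up for illustration.

```python
import math
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query_vec: Sequence[float], vectors: List[Sequence[float]], k: int = 3) -> List[int]:
    """Return the indices of the k stored vectors most similar to the query."""
    scored = [(cosine(query_vec, vec), i) for i, vec in enumerate(vectors)]
    scored.sort(reverse=True)
    return [i for _, i in scored[:k]]


def with_neighbors(hits: List[int], n_chunks: int, window: int = 1) -> List[int]:
    """Add the chunks directly around each hit to provide more context."""
    keep = set()
    for i in hits:
        for j in range(i - window, i + window + 1):
            if 0 <= j < n_chunks:
                keep.add(j)
    return sorted(keep)


# Made-up vectors for five consecutive chunks of one document.
vectors = [[1, 0, 0], [0.9, 0.1, 0], [0, 1, 0], [0, 0, 1], [0.5, 0.5, 0]]
hits = top_k([1, 0, 0], vectors, k=2)
print(hits, with_neighbors(hits, len(vectors)))
```

A vector database performs the same ranking, but with index structures that avoid comparing the query against every stored vector.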
Example of a prompt template
Behavior and rules: Machine assistant: You are a friendly, supportive assistant designed for optimal machine operation. You answer questions based solely on the information given to you and do not give speculative or opinionated answers. If necessary, you will ask for clarification and further information. You politely decline any questions that are outside the machine-related context. Your main function is to assist operators to maintain the optimum operating point. You provide guidance on general machine operation and give advice on better utilization when requested. In all your answers, you consistently refer to the original data sources given to you or provide the specific reference to the relevant place in the operator manual.
Context: <here comes the context from the vector database>
Question: <here comes the actual user query>
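How such a template might be filled in code is sketched below. The shortened rule text, the `build_prompt` helper and the source labels are illustrative assumptions and not part of any particular framework.

```python
from typing import List, Tuple

# Shortened, illustrative version of the behavior rules shown above.
RULES = (
    "You are a friendly, supportive assistant designed for optimal machine "
    "operation. You answer questions based solely on the information given "
    "to you and refer to the original data sources."
)

PROMPT_TEMPLATE = """Behavior and rules: {rules}

Context:
{context}

Question: {question}"""


def build_prompt(question: str, hits: List[Tuple[str, str]]) -> str:
    """Combine rules, retrieved (source, text) pairs and the user question."""
    context = "\n".join(f"[{source}] {text}" for source, text in hits)
    if not context:
        # Tell the model explicitly that nothing was found, so it can decline.
        context = "(no matching information found)"
    return PROMPT_TEMPLATE.format(
        rules=RULES, context=context, question=question.strip()
    )


prompt = build_prompt(
    "How often must the spindle be lubricated? ",
    [("operator manual, p. 12", "Lubricate the spindle every 500 operating hours.")],
)
print(prompt)
```

Prefixing each block of context with its source is one way to enable the reference to the operator manual that the rules ask for.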
Technical implementation
Figure 2: Exemplary constellations of software agents for solving complex problems with the "autogen" software framework. © Microsoft
For technical implementation, there are now providers of modular systems or, for those who want to control their chatbot more precisely, powerful software frameworks, each with particular strengths and weaknesses. If you want to create a quick proof of concept of a chatbot with local data yourself, you can use the following components, for example. In addition to their high technical quality, the easy availability and liberal license conditions of the open-source components were the decisive reasons for the technology selection made here:
- User interface with a web visualization for chatbots "Chainlit"
- Data processing with the "llama_index" framework
- Vector database "Postgres" (advantage: also suitable for pure keyword searches)
- Embedding model from OpenAI
- gpt-4 as LLM from OpenAI
Chatbot of the next generation
It can be assumed that we will become accustomed to the new, convenient and simple user interfaces, and that expectations regarding the complexity of the questions that can be asked and the reliability of the results will rise rapidly. In addition to stronger language models, another major lever is the expansion of chatbots with more or less visible AI agents that have special skills and tools. Working as a "team", these agents can correct, supplement and evaluate one another and thus significantly improve the quality of the answers. This is currently a very dynamic field of software development. Alongside other software frameworks, Microsoft's "autogen" framework is particularly well suited to the configuration of agent networks in a wide range of variations (Figure 2).
In summary, the technology is ready for chatbots to be integrated as a further element for the simple and intuitive operation of complex industrial applications. This new category of conversational interfaces makes it easier to solve basic user problems caused by multilingualism and cultural differences, special technical jargon, a flood of information or interconnected contexts. In combination with good graphical user guidance, we can look forward to a whole new level of interaction and natural language conversation with technical systems.
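The idea of agents that check and improve each other's output can be illustrated with a plain-Python loop. This sketch deliberately does not use the "autogen" API; the writer and reviewer below are hypothetical stubs that stand in for LLM-backed agents, and the answer text is invented.

```python
from typing import Callable, List, Tuple

Writer = Callable[[str, List[str]], str]      # (question, feedback so far) -> draft
Reviewer = Callable[[str], Tuple[bool, str]]  # draft -> (approved, comment)


def review_loop(
    question: str, writer: Writer, reviewer: Reviewer, max_rounds: int = 3
) -> Tuple[str, int]:
    """Let a writer agent revise its draft until the reviewer agent approves.

    Returns the last draft and the number of rounds that were needed.
    """
    feedback: List[str] = []
    draft = ""
    for round_no in range(1, max_rounds + 1):
        draft = writer(question, feedback)
        approved, comment = reviewer(draft)
        if approved:
            return draft, round_no
        feedback.append(comment)
    return draft, max_rounds  # give up after max_rounds, return the last draft


def stub_writer(question: str, feedback: List[str]) -> str:
    """Hypothetical writer: adds a source once the reviewer has asked for one."""
    answer = "Reduce the feed rate by 10 percent."
    if any("source" in note for note in feedback):
        answer += " (Source: operator manual, section 4.2)"
    return answer


def stub_reviewer(draft: str) -> Tuple[bool, str]:
    """Hypothetical reviewer: only approves answers that cite a source."""
    if "Source:" in draft:
        return True, ""
    return False, "Please cite the source of this advice."


final_answer, rounds = review_loop(
    "How can tool wear be reduced?", stub_writer, stub_reviewer
)
print(rounds, final_answer)
```

The cap on rounds matters in practice, since two model-backed agents could otherwise keep revising without ever converging.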
| Use cases of chatbots |
|---|
| Customer service: Chatbots can answer questions and provide information around the clock, improving customer satisfaction and reducing customer service workload. |
| Employee and operator support: Chatbots can serve as virtual assistants to help employees and machine operators answer questions or perform complex tasks. |
| Personalized learning and entertainment experiences: Chatbots can cater to the individual needs and preferences of users, providing a personalized experience. |
















