June 3, 2026

A question answered in a second: SFedU scientists speed up the search for lost cargo tenfold

Logistics companies process huge amounts of data every day: shipping invoices, warehouse reports, and reports on traffic, weather, and accidents.

To answer a simple question such as "When will the cargo arrive?", an operator has to look through several disparate sources, which takes a lot of time. Artificial intelligence, when asked such questions, often cannot assess the situation objectively and tends to produce plausible but unreliable information (the so-called "hallucinations" of neural networks).

Scientists at Southern Federal University have proposed a method that processes logistics-related queries in natural language ten times faster than graph search (which follows connections without understanding meaning) and at the same time answers more accurately than vector search (which captures meaning without understanding connections). At the center of the development is the so-called RAG (Retrieval-Augmented Generation) architecture, in which the language model does not generate an answer "off the top of its head" but first finds relevant fragments in an external knowledge base.
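
The retrieve-first idea behind RAG can be illustrated with a minimal sketch. It is not the authors' implementation: the knowledge base, the word-overlap ranking (a stand-in for real vector or graph retrieval), and the function names `retrieve` and `build_prompt` are all assumptions made for illustration, and no language model is actually called.

```python
import re

# Hypothetical knowledge base: short facts about a toy logistics network.
KNOWLEDGE_BASE = [
    "Shipment 481 left warehouse 5 on route M-4.",
    "Route M-4 serves store 12.",
    "Rain is reported on the M-4 section.",
    "Warehouse 9 ships via route A-1.",
]


def tokens(text):
    """Lowercase alphanumeric tokens of a text, as a set."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question, top_k=2):
    """Rank facts by word overlap with the question (assumed stand-in for real retrieval)."""
    q = tokens(question)
    ranked = sorted(KNOWLEDGE_BASE, key=lambda fact: len(q & tokens(fact)), reverse=True)
    return ranked[:top_k]


def build_prompt(question, top_k=2):
    """Assemble a prompt that restricts the model to the retrieved facts."""
    lines = ["Answer using only the facts below; if they are insufficient, say so.", "Facts:"]
    lines += [f"- {fact}" for fact in retrieve(question, top_k)]
    lines.append(f"Question: {question}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_prompt("Which store does route M-4 serve?"))
```

The prompt that reaches the model contains only the few retrieved facts, so unrelated records (here, the one about warehouse 9) never enter the context.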

"The main problem is that logistics systems work with large amounts of heterogeneous and rapidly updated data: information about warehouses, routes, transport, traffic conditions, weather, applications and documents. The user often needs not just a keyword search but an answer based on the relationships between objects. Conventional search algorithms handle such queries poorly because they do not always take the structure of the logistics network into account. Voice assistants simplify interaction, but by themselves they do not provide accurate access to the operational data of a particular system and the connections within it," said Vadim Voloshchuk, one of the authors of the study and a graduate student at the Institute of Computer Technology and Information Security of Southern Federal University.

The system works like this. If the user directly names a specific object (for example, "warehouse number five"), the program instantly outputs all the information related to it. If the query is formulated in free form, the computer finds several reference nodes that are closest in meaning, starts traversing the graph from each of them to a given depth, collects all related objects and ranks them according to a combined rule.
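
The two paths described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the published method: the toy graph, the two-dimensional "embeddings", the function names, and the combined rule (a weighted sum of cosine similarity and inverse hop distance, with hypothetical weights `alpha` and `beta`) are all invented here, since the article does not give the actual formula.

```python
import math
from collections import deque

# Hypothetical toy network: node -> neighbouring nodes (undirected).
GRAPH = {
    "warehouse_5": ["route_M4", "truck_17"],
    "route_M4": ["warehouse_5", "store_12", "weather_rain"],
    "truck_17": ["warehouse_5"],
    "store_12": ["route_M4"],
    "weather_rain": ["route_M4"],
}

# Hypothetical 2-d vectors standing in for real text embeddings.
EMBEDDINGS = {
    "warehouse_5": (1.0, 0.0),
    "route_M4": (0.6, 0.8),
    "truck_17": (0.8, 0.6),
    "store_12": (0.0, 1.0),
    "weather_rain": (-1.0, 0.0),
}


def cosine(a, b):
    """Cosine similarity of two 2-d vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


def neighbourhood(start, depth):
    """Breadth-first traversal to a given depth; returns {node: hop distance}."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if dist[node] == depth:
            continue  # do not expand beyond the depth limit
        for nxt in GRAPH[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def hybrid_search(query_vec, named_node=None, seeds=2, depth=2, alpha=0.6, beta=0.4):
    # Path 1: the user named a specific object, so return it with its direct links.
    if named_node is not None:
        return sorted(neighbourhood(named_node, 1))
    # Path 2: pick the reference nodes closest in meaning to the query.
    sims = {node: cosine(query_vec, vec) for node, vec in EMBEDDINGS.items()}
    anchors = sorted(sims, key=sims.get, reverse=True)[:seeds]
    # Traverse from each reference node and keep the shortest hop distance per object.
    best_hops = {}
    for anchor in anchors:
        for node, hops in neighbourhood(anchor, depth).items():
            best_hops[node] = min(hops, best_hops.get(node, hops))
    # Assumed combined rule: weighted semantic similarity plus graph closeness.
    score = {n: alpha * sims[n] + beta / (1 + h) for n, h in best_hops.items()}
    return sorted(score, key=score.get, reverse=True)


if __name__ == "__main__":
    print(hybrid_search((0.0, 1.0), named_node="warehouse_5"))
    print(hybrid_search((0.0, 1.0))[:3])
```

In this sketch a query vector close to "store_12" brings back the store first, then the route that serves it, then the truck reachable through the graph, which is the kind of connected result set the article describes.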

"The vector representation helps find objects that are close to the query in meaning, even if the user phrases the question differently from how it is recorded in the database. The graph part preserves connections between objects: warehouses, stores, roads, transport, weather conditions, and other logistics elements. Combining them makes it possible to first find semantically similar objects and then refine the result using the real dependencies in the system. The risk of "hallucinations" is reduced because the language model receives not a large set of disparate texts but a compact, connected one," added Yaroslav Melnik, a graduate student at SFedU.

The developers tested the method on synthetic data that models logistics networks of various scales, from relatively small ones (about five hundred key facilities and tens of thousands of connections) to large ones (over two thousand objects and hundreds of thousands of connections). Three approaches were compared: graph, vector, and hybrid search. Accuracy was measured as the proportion of correct answers among the top three results. Hybrid search turned out to be twice as accurate as graph search and one and a half times as accurate as vector search.
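
One common reading of such a metric is top-3 accuracy: a query counts as answered correctly if the expected object appears among the first three results. The sketch below assumes that reading; the rankings and expected answers are made up for illustration and are not the study's data or results.

```python
def top_k_accuracy(rankings, expected, k=3):
    """Share of queries whose expected object appears in the first k results."""
    hits = sum(1 for query, ranked in rankings.items() if expected[query] in ranked[:k])
    return hits / len(rankings)


# Made-up search output for three hypothetical queries.
RANKINGS = {
    "q1": ["store_12", "route_M4", "truck_17", "warehouse_5"],
    "q2": ["warehouse_5", "truck_17", "route_M4", "store_12"],
    "q3": ["truck_17", "warehouse_5", "weather_rain", "route_M4"],
}

# Made-up correct answers for the same queries.
EXPECTED = {"q1": "route_M4", "q2": "store_12", "q3": "weather_rain"}

if __name__ == "__main__":
    print(top_k_accuracy(RANKINGS, EXPECTED))
```

Here two of the three queries have their expected object in the top three, so the function returns two thirds; comparing this number across graph, vector, and hybrid rankings is how the three approaches could be set side by side.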

"The most significant parameters are the weights of semantic proximity to the query and of proximity in the graph. There are also auxiliary parameters that help take into account the local and overall importance of a node in the network. The method is sensitive to these settings, but not critically so: they need to be tuned to the type of task. For more text-oriented queries, semantics matters more; for routes and dependencies, the graph part matters more," Vadim Voloshchuk said.
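
How such weights can change the outcome is easy to show on a toy example. The sketch below assumes a simple weighted sum of four per-object features (semantic proximity, graph proximity, local importance, global importance); the feature values, the weight presets, and all names are hypothetical, and the actual scoring rule of the published method may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Weights:
    """Hypothetical weights of the four score components."""
    semantic: float
    graph: float
    local: float = 0.0
    global_: float = 0.0


def combined_score(features, w):
    """Assumed combined rule: a plain weighted sum of the four features."""
    return (w.semantic * features["semantic"] + w.graph * features["graph"]
            + w.local * features["local"] + w.global_ * features["global"])


# Made-up feature values for two candidate objects.
CANDIDATES = {
    "delay_report_0412": {"semantic": 0.9, "graph": 0.2, "local": 0.1, "global": 0.1},
    "route_M4_segment": {"semantic": 0.4, "graph": 0.9, "local": 0.6, "global": 0.5},
}


def rank(w):
    """Order candidates by combined score, best first."""
    return sorted(CANDIDATES, key=lambda name: combined_score(CANDIDATES[name], w), reverse=True)


TEXT_HEAVY = Weights(semantic=0.8, graph=0.2)                            # text-oriented queries
ROUTE_HEAVY = Weights(semantic=0.3, graph=0.5, local=0.1, global_=0.1)   # routes and dependencies

if __name__ == "__main__":
    print(rank(TEXT_HEAVY))
    print(rank(ROUTE_HEAVY))
```

With the text-heavy preset the report that matches the wording comes first; with the route-heavy preset the well-connected route segment overtakes it, which mirrors the trade-off described in the quote.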

The practical value of the development is clear for any business involved in transportation and warehouse logistics. A call center operator can ask the system, "What shipments were delayed last week due to the weather?" and get an accurate answer without spending hours manually going through reports. A dispatcher who sees a message about an accident on the highway can instantly find out which trips and which stores are affected. The system can also issue warnings on its own: "The delivery of dairy products to store 12 will be delayed due to rain on the M-4 section."

"First of all, it can be useful as an intelligent search or consulting layer on top of warehouse management systems (WMS), transport management systems (TMS) and analytical systems. For practical implementation, you need to adapt the data model to a specific system, connect real data sources, configure search parameters, and evaluate the quality of responses together with domain experts. At the first stage, it is reasonable to consider the method not as a replacement for existing systems, but as an additional module for more convenient access to related data," Yaroslav Melnik noted.

The article was published in the international peer-reviewed journal Big Data and Cognitive Computing. Its authors are staff of the Institute of Computer Technology and Information Security, the Research Institute of Robotics and Control Systems, and the Research Institute of Smart Materials of Southern Federal University.

Short link to this page: sfedu.ru/news/80776
