
Through the Thicket: A Study of Number-Oriented LLMs Derived from Random Forest Models

Open Access | Mar 2025

Abstract

This paper introduces a novel approach to training Large Language Models (LLMs) using knowledge transfer from a Random Forest (RF) ensemble. By converting RF decision paths into natural language, this method enhances both the classification accuracy and explanation capabilities of LLMs. Our approach integrates three preprocessing techniques tailored for numerical data: Relation Encoding, Integer Normalisation, and Verbal Description of Values, which improve the model's ability to interpret structured inputs effectively. Leveraging RF's ensemble properties, we generate rule-based explanations that can be objectively validated, offering a cost-effective alternative to human evaluations. Experiments on well-known datasets demonstrate high classification accuracy, highlighting the potential of our framework for numerical and structured data applications. This study also contributes to Explainable Artificial Intelligence (XAI) by providing LLMs with structured, objectively verifiable explanations, making them more accessible and interpretable for real-world decision-making tasks.
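The core idea of the abstract, converting an RF decision path into a natural-language rule, can be sketched as follows. This is a minimal illustration using scikit-learn, not the authors' implementation; the function name `path_to_text` and the sentence template are hypothetical, and the paper's additional preprocessing steps (Relation Encoding, Integer Normalisation, Verbal Description of Values) are omitted.

```python
# Hedged sketch: verbalise the decision path of one sample through one tree
# of a Random Forest, producing a rule-like natural-language explanation.
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

def path_to_text(tree, feature_names, x):
    """Trace a single sample through one fitted decision tree and
    return its decision path as an English sentence."""
    t = tree.tree_
    node, clauses = 0, []
    while t.children_left[node] != -1:  # -1 marks a leaf node
        feat = t.feature[node]
        thr = t.threshold[node]
        if x[feat] <= thr:
            clauses.append(f"{feature_names[feat]} is at most {thr:.2f}")
            node = t.children_left[node]
        else:
            clauses.append(f"{feature_names[feat]} is greater than {thr:.2f}")
            node = t.children_right[node]
    return "Because " + " and ".join(clauses) + "."

data = load_iris()
rf = RandomForestClassifier(n_estimators=5, random_state=0)
rf.fit(data.data, data.target)

# Verbalise the first tree's path for the first sample.
text = path_to_text(rf.estimators_[0], data.feature_names, data.data[0])
print(text)
```

Repeating this over all trees in `rf.estimators_` yields an ensemble of such verbal rules, which (per the abstract) can serve both as training material for an LLM and as objectively checkable explanations.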

Language: English
Page range: 279 - 298
Submitted on: Oct 7, 2024
Accepted on: Mar 4, 2025
Published on: Mar 18, 2025
Published by: SAN University
In partnership with: Paradigm Publishing Services
Publication frequency: 4 times per year

© 2025 Michał Romaszewski, Przemysław Sekuła, Przemysław Głomb, Michał Cholewa, Katarzyna Kołodziej, published by SAN University
This work is licensed under the Creative Commons Attribution 4.0 License.