Abstract
Uncertainty in decision-making presents a critical challenge for autonomous agents, often leading to suboptimal or erroneous policies. This paper addresses two prevalent yet distinct types of uncertainty that significantly degrade agent performance: fuzzy uncertainty, stemming from ambiguous task boundaries, and gray uncertainty, arising from noisy or incomplete state observations. To tackle these challenges, we propose Dual-Task-State Inference (DTS-Infer), a novel framework that leverages variational inference within an off-policy reinforcement learning structure. DTS-Infer uses a dual-network architecture to explicitly disentangle and resolve these uncertainties: (1) a task inference network learns a latent distribution over tasks from historical data to disambiguate task goals, addressing fuzzy uncertainty; and (2) a state inference network captures robust latent features of the current state to overcome corrupted sensory input, addressing gray uncertainty. Extensive experiments on continuous control benchmarks demonstrate that DTS-Infer significantly outperforms state-of-the-art algorithms. For instance, in the Half-Cheetah-Fwd-Back environment, DTS-Infer achieves a final average reward of 1612.61, an 18.9% improvement over the PEARL algorithm. Ablation studies further confirm that our inference modules yield an 80% increase in average reward over a standard TD3 baseline, highlighting the method's effectiveness in enhancing the robustness and adaptability of intelligent agents.
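To make the dual-network idea concrete, the following is a minimal PyTorch sketch of two variational encoders, one inferring a latent task variable from a history of transitions and one inferring a robust latent state from the current (possibly noisy) observation. It is an illustrative sketch under assumed design choices, not the paper's implementation; the class names (GaussianEncoder, DualInference), layer sizes, latent dimensions, and the mean-pooling aggregation over the context are all hypothetical.

```python
import torch
import torch.nn as nn

class GaussianEncoder(nn.Module):
    """Maps an input to a diagonal Gaussian over a latent variable."""
    def __init__(self, in_dim, latent_dim, hidden_dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
        )
        self.mu = nn.Linear(hidden_dim, latent_dim)
        self.log_std = nn.Linear(hidden_dim, latent_dim)

    def forward(self, x):
        h = self.net(x)
        mu, log_std = self.mu(h), self.log_std(h).clamp(-5.0, 2.0)
        std = log_std.exp()
        # Reparameterization trick: sample a latent while keeping gradients.
        z = mu + std * torch.randn_like(std)
        return z, mu, std

class DualInference(nn.Module):
    """Illustrative dual-network layout: a task encoder infers a latent task
    distribution from historical (s, a, r, s') transitions, while a state
    encoder infers a robust latent state from the current observation."""
    def __init__(self, obs_dim, act_dim, task_latent_dim=5, state_latent_dim=16):
        super().__init__()
        context_dim = 2 * obs_dim + act_dim + 1  # one (s, a, r, s') transition
        self.task_encoder = GaussianEncoder(context_dim, task_latent_dim)
        self.state_encoder = GaussianEncoder(obs_dim, state_latent_dim)

    def forward(self, context, obs):
        # Encode each past transition, then mean-pool the per-transition
        # latents as a simple permutation-invariant aggregation (assumed).
        z_task, _, _ = self.task_encoder(context)
        z_task = z_task.mean(dim=0, keepdim=True)
        z_state, _, _ = self.state_encoder(obs)
        return z_task, z_state

# Example usage with hypothetical dimensions for a HalfCheetah-like task.
obs_dim, act_dim = 17, 6
model = DualInference(obs_dim, act_dim)
context = torch.randn(32, 2 * obs_dim + act_dim + 1)  # 32 past transitions
obs = torch.randn(1, obs_dim)                          # current noisy observation
z_task, z_state = model(context, obs)
```

In a full agent, both latents would condition an off-policy actor-critic (e.g., a TD3-style learner), with the encoders trained via a variational objective; those components are omitted here.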