Abstract


Optimization problems often hide many traps: a search algorithm must escape local optima while avoiding stagnation in poor regions. Classic metaheuristics such as the Genetic Algorithm, Particle Swarm Optimization, Differential Evolution, and Ant Colony Optimization operate with fixed rules. However, fixed rules can fail under dynamic conditions: they may converge prematurely or require heavy manual tuning. Several researchers have pursued two fresh ideas: self-adaptation (the rules tune themselves) and reinforcement learning (an agent learns which rule to apply). This review surveys recent work and finds clear growth: Q-learning-guided PSO for path planning, PPO-guided DE for multiobjective tasks, and meta-controllers that switch among many search operators. Most studies report better accuracy and faster convergence on the CEC, BBOB, and WFG testbeds. Yet open issues remain: results do not always transfer to other problems, large RL agents add computational cost, and many works still lack fair runtime limits. This review maps the state of the art, groups the methods into simple classes, and lists research gaps.




Keywords


Metaheuristics, Self-Adaptive Optimization, Reinforcement Learning, Dynamic Parameter Control, Premature Convergence