Inference-time scaling is one of the big themes of artificial intelligence in 2025, and AI labs are attacking it from different angles. In its latest research paper, Google DeepMind introduced the concept of "Mind Evolution," a technique that optimizes responses of large language models (LLMs) for planning and reasoning tasks.
Inference-time scaling techniques try to improve LLMs' performance by allowing them to "think" more when generating their answers. Practically, this means that instead of generating its answer in one go, a model is allowed to generate several answers, review and correct them, and explore different ways to solve the problem.
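The simplest form of this idea is best-of-N sampling: generate several candidate answers and keep the one an evaluator scores highest. The sketch below illustrates only the control flow; `stub_llm` and `score_answer` are invented stand-ins for a real model call and a real task-specific evaluator.

```python
import random

def stub_llm(prompt: str, seed: int) -> str:
    # Placeholder for a real LLM call; returns a deterministic fake answer per seed.
    rng = random.Random(seed)
    return f"candidate answer {rng.randint(0, 100)}"

def score_answer(answer: str) -> float:
    # Toy evaluator: in practice this would check task constraints.
    return float(answer.split()[-1])

def best_of_n(prompt: str, n: int = 5) -> str:
    """Generate several candidates and keep the highest-scoring one."""
    candidates = [stub_llm(prompt, seed=i) for i in range(n)]
    return max(candidates, key=score_answer)

print(best_of_n("Plan a 3-day trip to Paris"))
```

Techniques like Mind Evolution go beyond this one-shot selection by also refining and recombining candidates across iterations.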
Evolving LLM responses
Mind Evolution relies on two key components: search and genetic algorithms. Search algorithms are a common component in many inference-time scaling techniques; they help LLMs explore reasoning paths to find the one that leads to the best solution. Genetic algorithms are inspired by natural selection. They create and evolve a population of candidate solutions to optimize a goal, often referred to as the "fitness function."
Mind Evolution algorithm (source: arXiv)
Mind Evolution starts by creating a population of candidate solutions expressed in natural language. The solutions are generated by an LLM that has been given a description of the problem along with useful information and instructions. The LLM then evaluates each candidate and improves it if it does not meet the criteria for the solution.
The algorithm then selects the parents for the next generation of solutions by sampling from the existing population, with higher-quality solutions having a greater chance of being selected. It next creates new solutions through crossover (choosing parent pairs and combining their elements to create a new solution) and mutation (making random changes to newly created solutions). It reuses the evaluation method to refine the new solutions.
The cycle of evaluation, selection and recombination continues until the algorithm reaches the optimal solution or exhausts a preset number of iterations.
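The generate-evaluate-select-recombine cycle described above is the classic genetic algorithm loop. The toy sketch below evolves a string toward a target to make the mechanics concrete; the target phrase, alphabet, and string-level crossover/mutation are invented for illustration, whereas Mind Evolution performs these steps on natural-language solutions using an LLM.

```python
import random

random.seed(0)

# Toy stand-in for a natural-language solution the population should converge to.
TARGET = "pack light, book early, confirm hotel"
ALPHABET = "abcdefghijklmnopqrstuvwxyz ,"
POP_SIZE = 200

def fitness(candidate: str) -> int:
    # Higher is better: count characters matching the target position-by-position.
    return sum(a == b for a, b in zip(candidate, TARGET))

def random_candidate() -> str:
    return "".join(random.choice(ALPHABET) for _ in TARGET)

def select_parents(population):
    # Fitness-proportional sampling: better solutions are picked more often.
    weights = [fitness(c) + 1 for c in population]
    return random.choices(population, weights=weights, k=2)

def crossover(a: str, b: str) -> str:
    # Combine elements of two parents at a random cut point.
    cut = random.randrange(len(a))
    return a[:cut] + b[cut:]

def mutate(c: str, rate: float = 0.02) -> str:
    # Make small random changes to a newly created solution.
    return "".join(random.choice(ALPHABET) if random.random() < rate else ch
                   for ch in c)

population = [random_candidate() for _ in range(POP_SIZE)]
initial_best = max(fitness(c) for c in population)

for generation in range(300):
    if max(population, key=fitness) == TARGET:
        break
    population = [mutate(crossover(*select_parents(population)))
                  for _ in range(POP_SIZE)]

print(generation, max(population, key=fitness))
```

In Mind Evolution, each of these operators is implemented with LLM prompts: the model writes the initial candidates, critiques and repairs them, and merges parent solutions in natural language rather than at the character level.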
Refinement process for proposed solutions in the Mind Evolution algorithm (source: arXiv)
One of the important parts of Mind Evolution is the evaluation function. Evaluators in inference-time scaling techniques often require the problem to be formalized, translated from natural language into a structured, symbolic representation that a solver program can process. Formalizing a problem can require significant domain expertise: you must identify all the key elements that need to be represented symbolically and how they relate to one another. This requirement limits the applicability of such techniques.
In Mind Evolution, the fitness function is designed to work with natural language planning tasks where solutions are expressed in natural language. This allows the system to avoid formalizing problems, as long as a programmatic solution evaluator is available. It also provides textual feedback in addition to a numerical score, which allows the LLM to understand specific issues and make targeted improvements.
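A minimal sketch of such an evaluator is below, assuming a hypothetical travel-planning task: it checks a natural-language plan programmatically, without any symbolic formalization, and returns both a numeric score (for selection) and textual feedback (for the LLM to act on). The task, constraint names, and feedback strings are all invented for illustration.

```python
def evaluate_plan(plan: str, required_stops: list, budget: float, cost: float):
    """Return (score, feedback) for a natural-language plan.

    The score drives selection in the evolutionary loop; the feedback text
    tells the LLM which specific constraints failed so it can make
    targeted improvements.
    """
    feedback = []
    score = 0.0
    for stop in required_stops:
        if stop.lower() in plan.lower():
            score += 1.0
        else:
            feedback.append(f"The plan never visits {stop}; add it.")
    if cost <= budget:
        score += 1.0
    else:
        feedback.append(
            f"The plan costs {cost:.0f}, exceeding the budget of {budget:.0f}."
        )
    return score, " ".join(feedback) or "All constraints satisfied."

score, notes = evaluate_plan(
    "Day 1: Louvre. Day 2: Orsay.",
    required_stops=["Louvre", "Orsay", "Versailles"],
    budget=500,
    cost=620,
)
print(score, notes)
```

Because the evaluator inspects the plan text directly, no solver-ready encoding of the problem is needed; only this checker has to exist for the task.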
"We focus on evolving solutions in na ...