Species
A species is a cluster of genetically similar individuals. When a distance measure is attached to the engine, radiate groups the population into species each generation and lets them compete as groups rather than as a single pool. This protects a promising but immature lineage from being wiped out before it has had a chance to refine. This is the core idea behind NEAT-style speciation, and all of radiate's speciation is built on it.
This page covers what a species is and how the engine forms and uses them. For the distance measures that decide who is "similar," see the last section, Distance.
Speciation is opt-in
Speciation only runs if you provide a diversity measure (see Diversity). Without one, the engine evolves a single flat population and none of the machinery below is active.
What a species holds
Each species tracks a small amount of state across generations:
| Field | Meaning |
|---|---|
| mascot | A representative individual, re-chosen at random from the species' members every generation. Membership for the next generation is decided by distance to this mascot. |
| members | The individuals currently assigned to the species, stored as each individual's PhenotypeId rather than the entire individual. |
| generation | The generation in which the species was founded, used to compute its age. |
| best / stagnation | The species' best score so far and how many generations it has gone without improving it. |
| adjusted score | The species' fitness-shared score, used to decide how many offspring it earns (see below). |
The randomly-chosen mascot is deliberate: it keeps a species from being anchored to one fixed individual and lets the cluster drift as the population evolves.
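To make the table concrete, here is a minimal Python sketch of the state a species carries. It is not radiate's internal type: the `SpeciesRecord` class, its field names, and the `refresh_mascot` helper are hypothetical and only mirror the fields listed above.

```python
import random
from dataclasses import dataclass, field
from typing import Dict, List, Optional

Genome = List[float]


@dataclass
class SpeciesRecord:
    """Hypothetical stand-in for the per-species state listed in the table."""

    mascot: Genome                      # representative individual
    generation: int                     # generation in which the species was founded
    members: List[int] = field(default_factory=list)  # member ids, not genomes
    best: Optional[float] = None        # best score seen so far
    stagnation: int = 0                 # generations since `best` last improved
    adjusted_score: float = 0.0         # filled in by fitness sharing

    def age(self, current_generation: int) -> int:
        """Age in generations, derived from the founding generation."""
        return current_generation - self.generation

    def refresh_mascot(self, population: Dict[int, Genome], rng: random.Random) -> None:
        """Re-choose the mascot uniformly at random from the current members."""
        if self.members:
            self.mascot = population[rng.choice(self.members)]


population = {7: [0.1, 0.2], 12: [0.2, 0.1], 19: [0.0, 0.3]}
species = SpeciesRecord(mascot=population[7], generation=3, members=[7, 12, 19])
species.refresh_mascot(population, random.Random(0))
print(species.mascot, species.age(current_generation=10))
```

Storing member ids instead of genomes keeps the record small, which is why the sketch needs the population lookup when it picks a new mascot.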
The speciation lifecycle
Every generation the engine runs a speciation step that re-forms species from scratch against the previous generation's mascots, then prunes and scores them:
```mermaid
flowchart TD
    A[Pick a new mascot at random<br/>for each existing species] --> B[Resolve the current threshold<br/>from its Rate schedule]
    B --> C{For each individual:<br/>distance to a mascot < threshold?}
    C -->|yes| D[Join that species<br/>first match wins]
    C -->|no| E[Find nearest species<br/>within threshold]
    E -->|found| D
    E -->|none| F[Found a new species]
    D --> G[Prune empty species and<br/>species older than max_species_age]
    F --> G
    G --> H[Fitness sharing:<br/>compute each species' adjusted score]
    H --> I[Recombine: allocate offspring<br/>per species by adjusted score]
```
- Mascots are refreshed. Each existing species picks a new mascot uniformly at random from its current members.
- The threshold is resolved. species_threshold is a Rate, so it may be a constant or change over generations (see Adaptive thresholds).
- Individuals are assigned. Each individual is compared to the mascots; the first species whose mascot is within species_threshold claims it. This can run in parallel depending on the configured executor.
- Leftovers settle. An unassigned individual joins its single nearest species if one sits within the threshold; otherwise it founds a new species and becomes its mascot.
- Dead and stale species are removed. Species with no members disappear, and the filter step culls any species whose age exceeds max_species_age.
- Fitness is shared and offspring are allocated. Both are covered in the next section.
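The assignment and leftover steps can be sketched in plain Python. This is an illustration of the steps as described, not radiate's implementation: `assign_species`, the dictionary layout, and the use of Euclidean distance are assumptions. It also assumes that leftovers are handled one at a time and may join a species founded earlier in the same pass.

```python
import math
from typing import Dict, List, Sequence

Genome = Sequence[float]


def assign_species(
    population: Dict[int, Genome], mascots: List[Genome], threshold: float
) -> List[dict]:
    """Assign each individual to a species and drop species left empty."""
    species = [{"mascot": m, "members": []} for m in mascots]
    leftovers = []

    # Phase 1: the first species whose mascot is within the threshold claims
    # the individual (first match wins).
    for pid, genome in population.items():
        for s in species:
            if math.dist(genome, s["mascot"]) < threshold:
                s["members"].append(pid)
                break
        else:
            leftovers.append(pid)

    # Phase 2: a leftover joins its nearest species if that species is within
    # the threshold; otherwise it founds a new species and becomes its mascot.
    for pid in leftovers:
        genome = population[pid]
        nearest = min(
            species, key=lambda s: math.dist(genome, s["mascot"]), default=None
        )
        if nearest is not None and math.dist(genome, nearest["mascot"]) < threshold:
            nearest["members"].append(pid)
        else:
            species.append({"mascot": genome, "members": [pid]})

    # Species with no members disappear.
    return [s for s in species if s["members"]]


population = {0: (0.0, 0.0), 1: (0.1, 0.0), 2: (5.0, 5.0), 3: (5.1, 5.0), 4: (10.0, 10.0)}
result = assign_species(population, mascots=[(0.0, 0.0), (20.0, 20.0)], threshold=1.0)
print([s["members"] for s in result])  # [[0, 1], [2, 3], [4]]
```

The mascot at (20, 20) attracts nobody, so its species is pruned, while individuals 2 and 4 found new species because no mascot is close enough.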
Fitness sharing and offspring allocation
Rather than letting the globally-fittest individuals dominate reproduction, the engine shares fitness within a species and hands out offspring between species in proportion to how each species is doing on average.
Fitness sharing. A species' raw fitness is divided across its members, so being in a crowded species dilutes each member's contribution. Concretely, the species' adjusted score is the average of its members' scores, then normalized across all species so the adjusted scores form a distribution:
\[
\bar{f}_s = \frac{1}{S} \sum_{i=1}^{S} f_i, \qquad \text{adjusted}_s = \frac{\bar{f}_s}{\sum_{k} \bar{f}_k}
\]
where \(S\) is the species size, \(f_i\) is the score of the species' \(i\)-th member, and the sum in the denominator runs over all species \(k\). The effect: a large species must be better on average to keep its share, which discourages any one cluster from swamping the population.
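The averaging and normalization described above can be sketched as follows. The `adjusted_scores` function is hypothetical, and the sketch assumes non-negative scores where higher is better, so that the normalized values form a distribution.

```python
from typing import Dict, List


def adjusted_scores(species_scores: Dict[str, List[float]]) -> Dict[str, float]:
    """Average each species' member scores, then normalize across species."""
    means = {
        name: sum(scores) / len(scores) for name, scores in species_scores.items()
    }
    total = sum(means.values())
    return {name: mean / total for name, mean in means.items()}


# Species A has more total raw fitness (8 vs 6) but a lower average (2 vs 6).
shares = adjusted_scores({"A": [2.0, 2.0, 2.0, 2.0], "B": [6.0]})
print(shares)  # {'A': 0.25, 'B': 0.75}
```

Although species A holds more raw fitness in total, its lower average gives it the smaller share, which is the dilution effect fitness sharing is meant to produce.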
Offspring allocation. During recombination the total offspring budget is split into per-species quotas proportional to each species' normalized adjusted score (largest fractional remainders get the leftover slots if needed). Selection and alteration then happen within each species' sub-population:
- A higher-scoring species earns a larger quota of offspring.
- Survivors are still selected globally, but offspring are bred per-species, so young or unusual species get protected breeding room instead of competing head-to-head with established ones.
This per-species breeding is what allows novel structures to survive long enough to mature.
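The quota split with largest fractional remainders can be sketched like this. It is an illustration of the general largest-remainder method under the assumption that the shares sum to 1; `allocate_offspring` is a hypothetical name, not a radiate function.

```python
import math
from typing import Dict


def allocate_offspring(shares: Dict[str, float], budget: int) -> Dict[str, int]:
    """Split `budget` offspring across species in proportion to `shares`."""
    exact = {name: share * budget for name, share in shares.items()}
    # Every species first receives the whole-number part of its exact quota.
    quotas = {name: math.floor(value) for name, value in exact.items()}
    leftover = budget - sum(quotas.values())
    # Remaining slots go to the species with the largest fractional remainders.
    by_remainder = sorted(
        exact, key=lambda name: exact[name] - quotas[name], reverse=True
    )
    for name in by_remainder[:leftover]:
        quotas[name] += 1
    return quotas


quotas = allocate_offspring({"A": 0.5, "B": 0.3, "C": 0.2}, budget=7)
print(quotas)  # {'A': 4, 'B': 2, 'C': 1}
```

Here the exact quotas are 3.5, 2.1 and 1.4; the floors use six slots, and the seventh goes to A because it has the largest remainder.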
Tuning knobs
Species threshold
The threshold sets how close two individuals must be — under the chosen distance measure — to share a species. It is the single most impactful speciation knob.
```rust
let engine = GeneticEngine::builder()
    .codec(CharCodec::vector(10))
    .fitness_fn(your_char_fit_fn)
    // A distance measure turns speciation on; the threshold sets how close
    // two individuals must be (per the measure) to share a species.
    .diversity(HammingDistance)
    .species_threshold(0.5)
    // ... other parameters ...
    .build();
```
As a general rule, species_threshold behaves as follows:
- A lower threshold → individuals must be very similar to group → more, smaller species → more diversity, slower convergence.
- A higher threshold → loose grouping → fewer, larger species → less diversity, faster convergence.
Because the right value is measure-dependent, the practical approach is to set it, watch how many species form, and adjust until you get a handful of meaningful clusters rather than one giant species or hundreds of singletons.
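One way to build intuition for this is to sweep the threshold over a toy population and count the resulting clusters. The sketch below does not use radiate: `hamming` and `count_species` are hypothetical helpers, and the clustering is a simplified greedy first-match pass in which the first member of each cluster acts as its mascot.

```python
from typing import List


def hamming(a: str, b: str) -> float:
    """Fraction of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def count_species(population: List[str], threshold: float) -> int:
    """Greedy first-match clustering; returns the number of clusters formed."""
    mascots: List[str] = []
    for individual in population:
        if not any(hamming(individual, m) < threshold for m in mascots):
            mascots.append(individual)  # nothing close enough: found a species
    return len(mascots)


population = ["aaaa", "aaab", "aabb", "bbbb", "bbba", "cccc"]
counts = [count_species(population, t) for t in (0.2, 0.6, 1.1)]
print(counts)  # [6, 3, 1]
```

A threshold of 0.2 yields one species per individual, 1.1 lumps everyone together, and 0.6 recovers the three visible clusters, which is the kind of middle ground to aim for.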
Adaptive thresholds
Since species_threshold accepts a Rate, it can change over the run — for example starting tight to explore many niches, then widening to let the population consolidate:
```python
import radiate as rd

# `species_threshold` accepts a `Rate`, so it can change over generations.
# Here it widens from 0.3 to 0.9 across the first 100 generations: start
# fine-grained (many small species), then coarsen to encourage convergence.
engine = (
    rd.Engine.float(2)
    .fitness(your_fitness_func)
    .diversity(
        rd.Dist.euclidean(),
        species_threshold=rd.Rate.linear(start=0.3, end=0.9, duration=100),
    )
)
```

```rust
// `species_threshold` accepts a `Rate`, so it can change over generations.
// Here it widens from 0.3 to 0.9 across the first 100 generations: start
// fine-grained (many small species), then coarsen to encourage convergence.
let engine = GeneticEngine::builder()
    .codec(FloatCodec::vector(2, -1.0..1.0))
    .fitness_fn(your_fitness_fn)
    .diversity(EuclideanDistance)
    .species_threshold(Rate::Linear(0.3, 0.9, 100))
    // ... other parameters ...
    .build();
```
The threshold can also be driven by live metrics via an expression — see Expressions.
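For intuition about what a linear schedule resolves to in a given generation, here is a standalone sketch. It is not radiate's `Rate` implementation: `linear_threshold` is a hypothetical function, and it assumes plain linear interpolation that holds at the end value once the duration has elapsed.

```python
def linear_threshold(generation: int, start: float, end: float, duration: int) -> float:
    """Interpolate from `start` to `end` over `duration` generations, then hold."""
    progress = min(max(generation / duration, 0.0), 1.0)
    return start + (end - start) * progress


# Resolve the threshold at a few points of a 0.3 -> 0.9 schedule over 100 generations.
for generation in (0, 50, 100, 250):
    print(generation, round(linear_threshold(generation, 0.3, 0.9, 100), 3))
```

Under these assumptions the threshold sits halfway between the two values at generation 50 and stays at the end value for the rest of the run.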
Target species count
The engine's target_species_count acts as a soft target for the number of species to maintain. Internally, it replaces the species_threshold with an Expression that nudges the threshold up or down based on how many species are currently active — if there are too many species, the threshold rises to encourage consolidation; if there are too few, it drops to encourage diversification.
```python
import radiate as rd

# `target_species` is an alternative to `species_threshold` that tries to maintain a certain number of
# species. The engine will adjust the threshold up or down as needed to try to meet the target count.
engine = (
    rd.Engine.float(2)
    .fitness(your_fitness_func)
    .diversity(rd.Dist.euclidean(), target_species=4)
)

# This is equivalent to setting the `species_threshold`
# (previous section) to an expression like so (this is actually what the engine does
# under the hood when you set `target_species`):
initial_threshold = 0.5  # <- note that this is the default species_threshold
target_species = 4
species_threshold = (
    rd.Expr.when(rd.Expr.select("index") < 2)
    .then(initial_threshold)
    .otherwise(
        (rd.Expr.select("species.count").error(target_species) * 0.05)
        + rd.Expr.select("species.threshold")
    )
)
```

```rust
let engine = GeneticEngine::builder()
    .codec(FloatCodec::vector(2, -1.0..1.0))
    .fitness_fn(your_fitness_fn)
    .diversity(EuclideanDistance)
    // Instead of a distance threshold, you can specify a target number of species.
    .target_species(4)
    // ... other parameters ...
    .build();

// Note that this is exactly the same as setting the species_threshold to an expression like so:
let count = 4;
// This would be the initial threshold if we were to use a static threshold
// instead of target_species_count.
let curr_threshold = 0.5;
let species_threshold = Expr::when(Expr::select(metric_names::INDEX).lt(2))
    .then(curr_threshold)
    .otherwise(
        (Expr::select(metric_names::SPECIES_COUNT).error(count as f32) * 0.05)
            + Expr::select(metric_names::SPECIES_THRESHOLD),
    );
```
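The update rule encoded by that expression can be written out as ordinary arithmetic. The sketch below is an interpretation, not engine code: `target_species_threshold` is a hypothetical function, and it assumes `error(target)` evaluates to the current species count minus the target, which is the sign that matches the behavior described above (too many species raises the threshold).

```python
def target_species_threshold(
    index: int,
    current: float,
    species_count: int,
    target: int,
    initial: float = 0.5,
    gain: float = 0.05,
) -> float:
    """One step of the assumed threshold update for a target species count."""
    if index < 2:
        # The first two generations use the initial threshold unchanged.
        return initial
    # Assumed sign: a positive error (too many species) raises the threshold.
    return current + (species_count - target) * gain


too_many = target_species_threshold(index=5, current=0.5, species_count=10, target=4)
too_few = target_species_threshold(index=5, current=0.5, species_count=2, target=4)
print(round(too_many, 2), round(too_few, 2))
```

With ten species against a target of four the threshold moves up, and with two species it moves down, so the count is nudged toward the target a little each generation.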
Maximum species age
A species that goes max_species_age generations without improving its best score is considered stagnant and removed; its members sit out crossover and mutation for that generation. This frees the offspring budget for species that are still making progress.
```python
import radiate as rd

engine = (
    rd.Engine.float(2)
    .fitness(your_fitness_func)
    .diversity(rd.Dist.euclidean(), species_threshold=0.5)
    # A species that survives this many generations without improving its best
    # score is culled, and its members sit out crossover/mutation that generation.
    .age(max_species_age=25)
)
```

```rust
let engine = GeneticEngine::builder()
    .codec(FloatCodec::vector(2, -1.0..1.0))
    .fitness_fn(your_fitness_fn)
    .diversity(EuclideanDistance)
    .species_threshold(0.5)
    // A species that survives this many generations without improving its best
    // score is culled, and its members sit out crossover/mutation that generation.
    .max_species_age(25)
    // ... other parameters ...
    .build();
```
The default is 25 generations. Increase it for hard problems that need more time to refine a niche; decrease it to clear out stagnant clusters faster.
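The culling rule described in this section can be sketched with a per-species stagnation counter. This is an illustration only: `update_and_cull` is a hypothetical function, higher scores are assumed to be better, and a species is assumed to be culled once its counter reaches max_species_age.

```python
from typing import Dict, List


def update_and_cull(
    stagnation: Dict[str, int],
    best: Dict[str, float],
    current: Dict[str, float],
    max_species_age: int,
) -> List[str]:
    """Update counters in place and return the species that should be culled."""
    culled = []
    for name, score in current.items():
        if score > best.get(name, float("-inf")):
            best[name] = score       # improvement resets the counter
            stagnation[name] = 0
        else:
            stagnation[name] = stagnation.get(name, 0) + 1
        if stagnation[name] >= max_species_age:
            culled.append(name)
    return culled


stagnation: Dict[str, int] = {}
best: Dict[str, float] = {}
history = []
# Species A improves every generation; species B stays flat at 5.0.
for generation in range(4):
    scores = {"A": float(generation + 1), "B": 5.0}
    history.append(update_and_cull(stagnation, best, scores, max_species_age=3))
print(history)  # [[], [], [], ['B']]
```

Species B is removed after three generations without improvement even though its score is higher than A's, which frees its share of the offspring budget.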
Common pitfalls
- Threshold scaled wrong for the measure.
    - Symptom: one species containing everyone, or a new species for nearly every individual.
    - Fix: re-scale species_threshold to your distance measure's range. A value that works for Hamming ([0, 1]) is meaningless for an unbounded Euclidean distance.
- Premature convergence.
    - Symptom: the population locks onto a suboptimal solution early.
    - Fix: lower the threshold (more species), raise max_species_age, or increase the mutation rate.
- Failure to converge.
    - Symptom: the population stays scattered and never settles.
    - Fix: raise the threshold (fewer species), lower max_species_age, or increase selection pressure.
- Stagnation.
    - Symptom: improvement flatlines.
    - Fix: lower max_species_age to recycle stale species, raise the mutation rate, or try an adaptive threshold.