Metrics

Metric collection in radiate is interwoven into every aspect of the evolutionary process. It uses the Kahan summation algorithm paired with Welford's one-pass online algorithm for fast, accurate, and numerically stable computation of statistics. All of this combined provides robust and reliable metric tracking throughout the evolutionary process. Using the MetricSet (a collection of independent Metrics) we can collect a whole host of statistics that span the entire evolutionary process allowing us to gain deep insights into the evolutionary dynamics.

MetricSet

The MetricSet is an object (struct) provided to the user in two main forms:

On the engine's Generation - given to the user after each epoch or each pass of the evolution process.
Through the engine's eventing system. Various events emit metric data allowing the user to track metrics or derive their own in real-time.

A metric is essentially a statistic with a name and some extra metadata attached to it. The Statistic exposes a number of different statistical measures that can be used to summarize the data, such as, last_value, count, min, max, mean, sum, variance, std_dev, skewness, and kurtosis.

There are a few different types of metrics that can be collected:

Numeric Metric: A plain old metric that collects single point numeric data each generation and aggregates it over the entire evolutionary run. For example, the rate.diversity metric collects the diversity rate each generation and adds it to the previous generation's metrics.
Duration Metric: A metric that collects timing information for various components of the engine. For example, the time.evaluation metric collects the time taken to perform evaluations each generation and adds it to the previous generation's metrics. When accessed, it should be noted that when calling metric.time() the underlying statistic will convert the numerical data to a Duration object, which provides methods for accessing the time in different units (e.g., seconds, milliseconds, etc.).
Distribution Metric: A metric that collects a distribution of data each generation, replacing the previous generation's data. For example, the scores metric collects the scores of all individuals in the population each generation, replacing the previous generation's scores. This means that each generation, the metric reflects only the current generation's state, nothing before it.

Collection

Along with the default metrics, each component will also collect metrics for the operations it performs. For example, each Alterer and Selector will collect metrics and be identified by their name. A few types of metrics will only be included if the parts of the engine which produce them are included. For instance, species level metrics will only be collected if the engine is configured to use species-based diversity. Also, front level metrics will only be collected if the engine is configured for multi-objective optimization.

Note this does not include all possible metrics. Certain metrics are only collected under specific conditions or configurations, such as when using species-based diversity or multi-objective optimization.

Default Metrics

Metrics collected by default (always included):

Name	Description
`index`	The index of the generation. This is pretty much just for reference if you want to keep a log per generation.
`time`	The time taken for the evolution process.
`scores`	The scores (fitness) of all the individuals evolved throughout the evolution process.
`scores.best`	The best score found so far over the course of the run.
`scores.evenness`	Pielou evenness of the population's fitness distribution, in `[0, 1]`. `~1.0` means fitness is spread evenly across distinct scores (healthy, exploring); `~0.0` means the population has collapsed onto a plateau (premature convergence).
`scores.gini`	Gini coefficient of the population's fitness distribution, in `[0, 1]` — a measure of fitness inequality / effective selection pressure. `0.0` is perfect equality (fully converged); high (`~0.7+`) means a small elite holds most of the fitness mass.
`age`	The age of all the individuals in the `Ecosystem` throughout the evolution process.
`genome.size`	The size of each genome over the evolution process. This is usually static and doesn't change.
`replace.age`	The number of individuals replaced based on age.
`replace.invalid`	The number of individuals replaced based on invalid structure (e.g. Bounds).
`unique.members`	The number of unique members in the `Ecosystem`.
`unique.scores`	The number of unique scores in the `Ecosystem`.
`new.children`	The number of new children created each generation through either mutation or crossover (or both).
`count.survivor`	The number of individuals that survived to the next generation - summation throughout the evolution process.
`count.evaluation`	The total number of evaluations performed per generation.
`rate.carryover`	The rate at which unique individuals are carried over to the next generation - `survivor_count` per generation / population size.
`rate.diversity`	The ratio of unique scores to the size of the `Ecosystem`.
`score.volatility`	The volatility of the scores in the `Ecosystem`. This is calculated as the standard deviation of the scores / mean.
`score.improvement`	The improvement of the best score from the previous generation to the current generation - either a 1 or 0 each generation.

A few default metrics are only collected when the relevant data exists:

Name	Description
`genome.size.score.corr`	Pearson correlation between genome size and fitness across the population, in `[-1, 1]` — the bloat signal. Only emitted when genome length actually varies (variable-length GP genomes); for fixed-length genomes there is no size variance and the metric is omitted.

Multi-objective naming

The per-dimension metrics — scores, scores.best, scores.evenness, scores.gini, unique.scores, and genome.size.score.corr — are emitted under their bare name for single-objective runs. Under multi-objective optimization they gain a numeric suffix per objective instead (scores.0, scores.1, …, scores.best.0, scores.best.1, and so on).

Multi-objective Metrics

Additional metrics collected when using multi-objective optimization:

Name	Description
`front.additions`	The number of members added to the Pareto front each generation.
`front.removals`	The number of members removed from the Pareto front each generation.
`front.size`	The size of the Pareto front each generation.
`front.comparisons`	The number of comparisons made to update the Pareto front each generation.
`front.filters`	The number of times the Pareto front was filtered each generation.
`front.entropy`	The entropy of the Pareto front throughout the evolution process - only calculated every 10 generations (its kinda an expensive calculation).

Species-based Metrics

Additional metrics collected when using species-based diversity:

Name	Description
`species.count`	The number of `species` in the `Ecosystem`.
`species.new`	The number of `species` created in the `Ecosystem`.
`species.new.ratio`	The ratio of new species created each generation.
`species.fail.empty`	The number of `species` that have died (emptied out) in the `Ecosystem`.
`species.age`	The age of all the `species` in the `Ecosystem`.
`species.fail.age`	The count of species that have failed based on age each generation.
`species.size`	The distribution of species sizes (number of members per species) each generation.
`species.distance`	The distribution of compatibility distances used when assigning members to species.
`species.threshold`	The current compatibility threshold used to decide species membership.
`species.evenness`	The evenness of the species distribution in the `Ecosystem`.
`species.largest_share`	The share of the largest species in the `Ecosystem`.

Accessing Metrics

These can be accessed through the metrics() method of the Generation object, which returns a MetricSet. Each individual Metric can be accessed by its name, and the various statistical measures can be accessed through the methods of the Metric object. Additionally, the MetricSet provides a dashboard() method that pretty-prints all the metrics in a user-friendly format.

Python Rust

import radiate as rd

# Create an engine with 3 chromosomes each with 2 genes (i.e. a 3x2 matrix)
engine = (
    rd.Engine.float([2, 2, 2], init_range=(0.0, 1.0))
    .fitness(my_fitness_fn)  # Single objective fitness function
    .limit(rd.Limit.generations(100))
    # ... other parameters ...
)

# Run the engine for 100 generations
result = engine.run()

# Get the metrics of the engine
metrics = result.metrics()  # MetricSet object
df = (
    metrics.to_polars()
)  # Convert metrics to a Polars DataFrame for analysis (if installed)
df = (
    metrics.to_pandas()
)  # Convert metrics to a Pandas DataFrame for analysis (if installed)

# Access specific metrics
carry_over = metrics[
    "rate.carryover"
].max()  # Maximum carryover rate throughout evolution

scores = metrics["scores"]
score_mean = scores.mean()
score_stddev = scores.stddev()
score_variance = scores.variance()
score_min = scores.min()
score_max = scores.max()
score_count = scores.count()
score_skew = scores.skew()
score_sum = scores.sum()

time = metrics["time"]

total_time = time.time_sum()
mean_time = time.time_mean()
stddev_time = time.time_stddev()
variance_time = time.time_variance()
min_time = time.time_min()
max_time = time.time_max()

# pretty-print the metrics dashboard
print(metrics.dashboard())

// --- set up the engine ---

let result = engine.run(|generation| {
    // get the score metric from the generation context
    let temp = generation.metrics().get("scores").unwrap();
    // get the standard deviation of the score distribution
    let std = temp.stddev();

    std < 0.01 // Example condition to stop the engine
});

// Access the metrics from the result
let metrics: &MetricSet = result.metrics();

// pretty-print the metrics dashboard
println!("{}", metrics.dashboard());

Tags

All metrics have a sort of metadata which identifies them based on their characteristics or where they originate from. This can be used to filter and group metrics based on similar traits. For example, metrics related to time will have the time tag, while metrics related to mutators will have the mutator tag.

Python Rust

import radiate as rd

# Create the evolution engine
engine = (
    rd.Engine.bit(10)
    .fitness(fitness_function)
    .limit(rd.Limit.score(0.01), rd.Limit.generations(1000))
)

# Run the engine
result = engine.run()

# Access the metrics from the result
metrics = result.metrics()

# Get tags for a specific metric
tags = metrics[
    "scores"
].tags()  # e.g., ['rd.Tag.SCORE', 'rd.Tag.STATISTIC', 'rd.Tag.DISTRIBUTION']

for metric in metrics.values_by_tag(rd.Tag.ALTERER):
    ...  # access all metrics related to alterers (crossover, mutation) ...

// Create the evolution engine

let mut engine = GeneticEngine::builder()
    .codec(IntCodec::vector(10, 0..100))
    .minimizing()
    .fitness_fn(|geno: Vec<i32>| geno.iter().sum::<i32>())
    .build();

// Run the engine
let result = engine.run(|generation| generation.index() >= 1000);

// Access the metrics from the result
let metrics: &MetricSet = result.metrics();

// Get tags for a specific metric
let tags = metrics.get("scores").unwrap().tags(); // [TagType::Score, TagType::Statistic, TagType::Distribution]
for metric in metrics.iter_tagged(TagType::Alterer) {
    // ... access all metrics related to alterers (crossover, mutation) ...
}

// Collect unique tags contained in the MetricSet
let unique_tags = metrics.tags().collect::<Vec<_>>();

Tags available:

Tag	Description
`Selector`	Metrics related to selection mechanisms.
`Alterer`	Metrics related to alteration mechanisms (mutation, crossover).
`Mutator`	Metrics specifically related to mutation operations.
`Crossover`	Metrics specifically related to crossover operations.
`Species`	Metrics related to species-based diversity.
`Failure`	Metrics related to failures (e.g., invalid individuals).
`Age`	Metrics related to age-based operations.
`Front`	Metrics related to multi-objective optimization fronts.
`Derived`	Metrics that are derived from other metrics.
`Other`	Miscellaneous metrics that don't fit into other categories.
`Statistic`	Metrics that provide statistical measures.
`Time`	Metrics related to time measurements.
`Distribution`	Metrics that describe distributions (e.g., scores, ages).
`Score`	Metrics specifically related to fitness scores.
`Rate`	Metrics that represent rates (e.g., carryover rate).
`Step`	Metrics related to the different steps or phases of the evolutionary process (evaluation, recombination, filtering, etc.).
`Expr`	Custom metrics or expressions supplied by the user.