# Comparison of nTLJ Scenarios

### [nTLJ Notebook Developer Guide](https://docs.google.com/document/d/1dEIARDkgfaHBER-6ADQZpUWCaG5ChSLT4KVjq3OSzTI) | [nTLJ Metrics Design Doc](https://docs.google.com/document/d/1jQ7dQG477uDQww7jvYNvIcGwD9H2VwXNTz5GJu7Zkqc) | [Jupyter Notebook Developer Guide](https://git.zooxlabs.com/zooxco/driving/blob/master/mined_metric/jupyter/README.md)

## Table of Contents
1. [Data Loading and Configurations](#1.-Data-Loading-and-Configurations)
2. [Metrics Summary](#2.-Metrics-Summary)
3. [Comparison of SPC Metrics](#3.-Comparison-of-SPC-Metrics)
   - 3.1 [Comparison of Safety Metrics](#3.1-Comparison-of-Safety-Metrics)
   - 3.2 [Comparison of Comfort Metrics](#3.2-Comparison-of-Comfort-Metrics)
   - 3.3 [Comparison of Progress Metrics](#3.3-Comparison-of-Progress-Metrics)
4. [Comparison of Behaviors](#4.-Comparison-of-Behaviors)
   - 4.1 [Comparison of average ticks to stable yield](#4.1-Comparison-of-average-ticks-to-stable-yield)
   - 4.2 [Comparison of max flicker count](#4.2-Comparison-of-max-flicker-count)
   - 4.3 [Comparison of straight probability](#4.3-Comparison-of-straight-probability)
5. [Validations for ZRSS Metrics (For ZRSS developers only)](#5.-Validations-for-ZRSS-Metrics-(For-ZRSS-developers-only))
   - 5.1 [Filtered Time to Collision](#5.1-Filtered-Time-to-Collision)
   - 5.2 [RSS Time to Collision](#5.2-RSS-Time-to-Collision)
   - 5.3 [Severity](#5.3-Severity)
6. [Problematic Scenarios](#6.-Problematic-Scenarios)
   - 6.1 [Simulation Failures](#6.1-Simulation-Failures)
   - 6.2 [Not Completed Autonomously](#6.2-Not-Completed-Autonomously)
7. [Metrics Conclusion](#7.-Metrics-Conclusion)

## 1. Data Loading and Configurations

In [None]:
import json
import numpy as np

from mined_metric.jupyter.utils.data_access_util import *
from mined_metric.jupyter.utils.data_processing_util import *
from mined_metric.jupyter.utils.ntlj_vis_util import *
from mined_metric.jupyter.utils.validation_vis_util import *
from mined_metric.jupyter.utils.ntlj_metrics_conclusion_util import *

%matplotlib inline

EXP_ID_CONTROL = "EXP_ID_CONTROL_PLACEHOLDER"
EXP_ID_CANDIDATE = "EXP_ID_CANDIDATE_PLACEHOLDER"
VALIDATION_ID = "VALIDATION_ID_PLACEHOLDER"

# Metric configurations file
CONFIG_PATH = "/mnt/sun-pcs01/jupyterhub/nTLJ/default_ntlj_config [new].json"

In [None]:
# Retrieve all data for both the control and candidate branches.
# A filled-in VALIDATION_ID takes precedence; otherwise fall back to the two experiment IDs.
if "PLACEHOLDER" not in VALIDATION_ID:
    exp_data_control, exp_data_candidate = get_validation_data(VALIDATION_ID)
else:
    exp_data_control = get_experiment_data(EXP_ID_CONTROL)
    exp_data_candidate = get_experiment_data(EXP_ID_CANDIDATE)
print("Control Git SHA: {}".format(exp_data_control["experiment"]["gitsha"]))
print("Candidate Git SHA: {}".format(exp_data_candidate["experiment"]["gitsha"]))
print("Make sure the control and candidate Git SHAs are not swapped! (If they are, switch them.)")

In [None]:
# Read configurations for metrics
with open(CONFIG_PATH) as f:
    configs = json.load(f)
configs_all = [c for section in configs.values() for c in section]

# Read configurations for Argus videos
with open("/mnt/sun-pcs01/jupyterhub/argus/argus_side_by_side_layout.json") as f:
    argus_layout = json.load(f)
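
The flattening step above can be illustrated with a small stand-alone sketch. The config layout shown here (section name mapped to a list of entries, each with a `name` key) is an assumption made for illustration, and `find_config` is a hypothetical stand-in, not the actual `get_config` helper.

```python
# Hypothetical config layout: section name -> list of metric entries.
# The real JSON file may carry more fields per entry.
configs = {
    "safety": [
        {"name": "min_time_to_collision"},
        {"name": "car_heat_map_unweighted_cost_sum"},
    ],
    "progress": [
        {"name": "junction_clear_s"},
    ],
}

# Same comprehension as in the cell above: one flat list across all sections.
configs_all = [c for section in configs.values() for c in section]


def find_config(all_configs, metric_name):
    """Hypothetical lookup by metric name; raises if the metric is unknown."""
    for config in all_configs:
        if config["name"] == metric_name:
            return config
    raise KeyError(metric_name)


ttc_config = find_config(configs_all, "min_time_to_collision")
print(len(configs_all), ttc_config["name"])
```

Keeping both `configs` (per section) and `configs_all` (flat) lets later cells either summarize a whole section or pick out a single metric by name.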

## 2. Metrics Summary

In [None]:
# Calculate basic Chum URI summary statistics
chum_uris, meta_data_control, meta_data_candidate = parse_meta_data_for_validations(exp_data_control, exp_data_candidate)
scenario_lookup = get_scenario_lookup(chum_uris)

# List the IDs of scenarios to exclude from the metric computations,
# typically those with flaky or unrealistic failures.
scenario_ids_to_exclude = []
chum_uris = remove_scenarios_from_metric(scenario_ids_to_exclude, scenario_lookup, chum_uris, meta_data_control, meta_data_candidate)

print("Total number of scenarios: {}".format(len(chum_uris)))
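
As a rough illustration of the exclusion step, the sketch below filters a URI list and the two metadata lists by scenario ID. The data shapes are assumptions: the lookup is taken to map Chum URI to scenario ID, and each metadata entry is taken to be a dict with a `chum_uri` key. The call above reassigns only `chum_uris`, which suggests the metadata lists are filtered in place; the sketch mirrors that guess. `exclude_scenarios` is a hypothetical stand-in, not the implementation of `remove_scenarios_from_metric`.

```python
def exclude_scenarios(scenario_ids, scenario_lookup, chum_uris, *meta_data_lists):
    """Return the kept URIs and filter each metadata list in place (sketch)."""
    excluded_ids = set(scenario_ids)
    excluded_uris = {
        uri for uri in chum_uris if scenario_lookup.get(uri) in excluded_ids
    }
    for meta_data in meta_data_lists:
        # Slice assignment mutates the caller's list instead of rebinding it.
        meta_data[:] = [m for m in meta_data if m["chum_uri"] not in excluded_uris]
    return [uri for uri in chum_uris if uri not in excluded_uris]


# Made-up example data.
uris = ["chum://a", "chum://b", "chum://c"]
lookup = {"chum://a": "scenario-1", "chum://b": "scenario-2", "chum://c": "scenario-3"}
control = [{"chum_uri": uri} for uri in uris]
candidate = [{"chum_uri": uri} for uri in uris]

kept = exclude_scenarios(["scenario-2"], lookup, uris, control, candidate)
print(kept, len(control), len(candidate))
```

Filtering control and candidate together keeps the two metadata lists aligned, so later per-scenario comparisons still pair the same scenarios.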

In [None]:
# Violin chart of metric deviations
df = create_metric_aggregation_dataframe(configs_all, meta_data_control, meta_data_candidate)
plot_metric_summary_violin_chart(configs_all, meta_data_control, meta_data_candidate)

## 3. Comparison of SPC Metrics

### 3.1 Comparison of Safety Metrics

In [None]:
selected_chums_safety = []
create_metric_aggregation_dataframe(configs["safety"], meta_data_control, meta_data_candidate)

#### 3.1.1 Time to Collision

In [None]:
# Deviation in time to collision
config = [get_config(configs_all, "min_time_to_collision")]
html, ttc_metrics_bins = vis_ntlj_histograms(config, meta_data_control, meta_data_candidate, subplots=True)
html

In [None]:
# Scenarios in a specific bar in the histogram of deviations in time to collision
bin_num = "lowest" # valid values are 1-9, "lowest", and "highest"
bin_largest_regression = get_histogram_bin_bounds(ttc_metrics_bins, "min_time_to_collision", bin_num)
compare_select_scenarios(
    "min_time_to_collision",
    meta_data_control,
    meta_data_candidate,
    [vis_time_vs_ttc],
    "rel_diff_between",
    scenario_lookup,
    deviation_range=bin_largest_regression,
    limit=3,
    argus_layout=argus_layout,
    selected_chums=selected_chums_safety,
)
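
The bin-selection pattern used in this cell (and repeated in the cells that follow) can be sketched in plain Python. Everything here is an assumption for illustration: nine equal-width bins (inferred from the "1-9" comment), `"lowest"` and `"highest"` meaning the first and last bin, and a relative difference defined as `(candidate - control) / |control|`. The helper names are hypothetical and are not the notebook utilities.

```python
def relative_diff(control, candidate):
    """Relative deviation of the candidate value from the control value."""
    # Note: undefined when control == 0; a real implementation must handle that.
    return (candidate - control) / abs(control)


def equal_width_edges(values, n_bins=9):
    """Bin edges spanning min(values)..max(values) in n_bins equal steps."""
    low, high = min(values), max(values)
    width = (high - low) / n_bins
    return [low + i * width for i in range(n_bins)] + [high]


def bin_bounds(edges, bin_num):
    """Return (lower, upper) for a 1-based bin number, "lowest", or "highest"."""
    n_bins = len(edges) - 1
    if bin_num == "lowest":
        index = 0
    elif bin_num == "highest":
        index = n_bins - 1
    else:
        index = int(bin_num) - 1
        if not 0 <= index < n_bins:
            raise ValueError("bin_num must be between 1 and {}".format(n_bins))
    return edges[index], edges[index + 1]


# Made-up minimum TTC values (seconds) for four scenarios.
control_ttc = [2.0, 4.0, 5.0, 8.0]
candidate_ttc = [1.0, 4.0, 6.0, 8.0]

deviations = [relative_diff(c, k) for c, k in zip(control_ttc, candidate_ttc)]
edges = equal_width_edges(deviations)
low, high = bin_bounds(edges, "lowest")

# Indices of scenarios whose deviation falls inside the selected bin.
selected = [i for i, d in enumerate(deviations) if low <= d <= high]
print(selected)
```

For time to collision a lower value is worse, so the `"lowest"` bin holds the largest regressions; for cost-like metrics such as the heat map costs, the `"highest"` bin plays that role.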

#### 3.1.2 Car Heat Map Cost

In [None]:
# Deviation in car heat map costs
config = [get_config(configs_all, "car_heat_map_unweighted_cost_sum")]
html, car_heatmap_metrics_bins = vis_ntlj_histograms(config, meta_data_control, meta_data_candidate, subplots=True)
html

In [None]:
# Scenarios in a specific bar in the histogram of deviations in car heat map cost
bin_num = "highest" # valid values are 1-9, "lowest", and "highest"
bin_largest_regression = get_histogram_bin_bounds(car_heatmap_metrics_bins, "car_heat_map_unweighted_cost_sum", bin_num)
compare_select_scenarios(
    "car_heat_map_unweighted_cost_sum",
    meta_data_control,
    meta_data_candidate,
    [vis_time_vs_car_heatmap_cost],
    "rel_diff_between",
    scenario_lookup,
    deviation_range=bin_largest_regression,
    limit=3,
    argus_layout=argus_layout,
    selected_chums=selected_chums_safety,
)

#### 3.1.3 Ped/Bike Heat Map Cost

In [None]:
# Deviation in ped/bike heat map costs
config = [get_config(configs_all, "ped_bike_heat_map_unweighted_cost_sum")]
html, ped_heatmap_metrics_bins = vis_ntlj_histograms(config, meta_data_control, meta_data_candidate, subplots=True)
html

In [None]:
# Scenarios in a specific bar in the histogram of deviations in ped/bike heat map cost
bin_num = "highest" # valid values are 1-9, "lowest", and "highest"
bin_largest_regression = get_histogram_bin_bounds(ped_heatmap_metrics_bins, "ped_bike_heat_map_unweighted_cost_sum", bin_num)
compare_select_scenarios(
    "ped_bike_heat_map_unweighted_cost_sum",
    meta_data_control,
    meta_data_candidate,
    [vis_time_vs_ped_bike_heatmap_cost],
    "rel_diff_between",
    scenario_lookup,
    deviation_range=bin_largest_regression,
    limit=3,
    argus_layout=argus_layout,
    selected_chums=selected_chums_safety,
)

#### 3.1.4 Introspection Mode

In [None]:
# Introspect a single scenario
ntlj_introspection_mode(selected_chums_safety, meta_data_control, meta_data_candidate, scenario_lookup,
                        preset="safety", argus_layout=argus_layout)

### 3.2 Comparison of Comfort Metrics

In [None]:
selected_chums_comfort = []
create_metric_aggregation_dataframe(configs["comfort"], meta_data_control, meta_data_candidate)

#### 3.2.1 Longitudinal Acceleration

In [None]:
# Deviation in longitudinal acceleration
config = [get_config(configs_all, "longitudinal_acceleration_unweighted_sum")]
html, long_acc_bins = vis_ntlj_histograms(config, meta_data_control, meta_data_candidate, subplots=True)
html

In [None]:
# Scenarios in a specific bar in the histogram of deviations in longitudinal acceleration
bin_num = "highest" # valid values are 1-9, "lowest", and "highest"
bin_largest_long_acc_regression = get_histogram_bin_bounds(long_acc_bins, "longitudinal_acceleration_unweighted_sum", bin_num)
compare_select_scenarios(
    "longitudinal_acceleration_unweighted_sum",
    meta_data_control,
    meta_data_candidate,
    [vis_time_vs_longitudinal_acceleration],
    "diff_between",
    scenario_lookup,
    deviation_range=bin_largest_long_acc_regression,
    limit=3,
    argus_layout=argus_layout,
    selected_chums=selected_chums_comfort,
)

#### 3.2.2 Longitudinal Jerk

In [None]:
# Deviation in longitudinal jerk
config = [get_config(configs_all, "longitudinal_jerk_unweighted_sum")]
html, long_jerk_bins = vis_ntlj_histograms(config, meta_data_control, meta_data_candidate, subplots=True)
html

In [None]:
# Scenarios in a specific bar in the histogram of deviations in longitudinal jerk
bin_num = "highest" # valid values are 1-9, "lowest", and "highest"
bin_largest_long_jerk_regression = get_histogram_bin_bounds(long_jerk_bins, "longitudinal_jerk_unweighted_sum", bin_num)
compare_select_scenarios(
    "longitudinal_jerk_unweighted_sum",
    meta_data_control,
    meta_data_candidate,
    [vis_time_vs_longitudinal_jerk],
    "diff_between",
    scenario_lookup,
    deviation_range=bin_largest_long_jerk_regression,
    limit=3,
    argus_layout=argus_layout,
    selected_chums=selected_chums_comfort,
)

#### 3.2.3 Introspection Mode

In [None]:
# Introspect a single scenario
ntlj_introspection_mode(selected_chums_comfort, meta_data_control, meta_data_candidate, scenario_lookup,
                        preset="comfort", argus_layout=argus_layout)

### 3.3 Comparison of Progress Metrics

In [None]:
create_metric_aggregation_dataframe(configs["progress"], meta_data_control, meta_data_candidate)

In [None]:
# Deviation in progress metrics
html, progress_metrics_bins = vis_ntlj_histograms(configs["progress"], meta_data_control, meta_data_candidate,
                                                  subplots=True)
html

In [None]:
# Scenarios in a specific bar in the histogram of deviations in time to clear junction
bin_num = "highest" # valid values are 1-9, "lowest", and "highest"
bin_largest_junction_clear_regression = get_histogram_bin_bounds(progress_metrics_bins, "junction_clear_s", bin_num)
selected_chums_progress = []
compare_select_scenarios(
    "junction_clear_s",
    meta_data_control,
    meta_data_candidate,
    [vis_time_vs_distance],
    "diff_between",
    scenario_lookup,
    deviation_range=bin_largest_junction_clear_regression,
    limit=3,
    argus_layout=argus_layout,
    selected_chums=selected_chums_progress,
)

In [None]:
# Introspect a single scenario
ntlj_introspection_mode(selected_chums_progress, meta_data_control, meta_data_candidate, scenario_lookup,
                        preset="progress", argus_layout=argus_layout)

## 4. Comparison of Behaviors

### 4.1 Comparison of average ticks to stable yield

In [None]:
create_metric_aggregation_dataframe(configs["ticks_to_stable_yield"], meta_data_control, meta_data_candidate)

In [None]:
# Deviation in average ticks to stable yield
html, metrics_bins_ttsy = vis_ntlj_histograms(configs["ticks_to_stable_yield"], meta_data_control, meta_data_candidate,
                                              subplots=True)
html

In [None]:
# Scenarios in a specific bar in the histogram of deviations in average ticks to stable yield
bin_num = "highest" # valid values are 1-9, "lowest", and "highest"
bin_largest_ttsy_regression = get_histogram_bin_bounds(metrics_bins_ttsy, "junction_avg_ticks_to_stable_yield", bin_num)
selected_chums_ttsy = []
compare_select_scenarios(
    "junction_avg_ticks_to_stable_yield",
    meta_data_control,
    meta_data_candidate,
    [vis_time_vs_distance],
    "diff_between",
    scenario_lookup,
    deviation_range=bin_largest_ttsy_regression,
    limit=3,
    argus_layout=argus_layout,
    selected_chums=selected_chums_ttsy,
)

In [None]:
# Introspect a single scenario
ntlj_introspection_mode(selected_chums_ttsy, meta_data_control, meta_data_candidate, scenario_lookup,
                        argus_layout=argus_layout)

### 4.2 Comparison of max flicker count

In [None]:
create_metric_aggregation_dataframe(configs["max_flicker"], meta_data_control, meta_data_candidate)

In [None]:
# Deviation in max flicker count
html, metrics_bins_max_flicker = vis_ntlj_histograms(configs["max_flicker"], meta_data_control, meta_data_candidate,
                                                     subplots=True)
html

In [None]:
# Scenarios in a specific bar in the histogram of deviations in max flicker count
bin_num = "highest" # valid values are 1-9, "lowest", and "highest"
bin_largest_max_flicker_regression = get_histogram_bin_bounds(metrics_bins_max_flicker, "junction_max_flicker_count", bin_num)
selected_chums_max_flicker = []
compare_select_scenarios(
    "junction_max_flicker_count",
    meta_data_control,
    meta_data_candidate,
    [vis_time_vs_distance],
    "diff_between",
    scenario_lookup,
    deviation_range=bin_largest_max_flicker_regression,
    limit=3,
    argus_layout=argus_layout,
    selected_chums=selected_chums_max_flicker,
)

In [None]:
# Introspect a single scenario
ntlj_introspection_mode(selected_chums_max_flicker, meta_data_control, meta_data_candidate, scenario_lookup,
                        argus_layout=argus_layout)

### 4.3 Comparison of straight probability

In [None]:
create_metric_aggregation_dataframe(configs["straight_probability"], meta_data_control, meta_data_candidate)

In [None]:
# Deviation in straight probability
html, straight_prob_metrics_bins = vis_ntlj_histograms(configs["straight_probability"], meta_data_control,
                                                       meta_data_candidate, subplots=True)
html

In [None]:
# Scenarios in a specific bar in the histogram of deviations in straight probability
bin_num = "highest" # valid values are 1-9, "lowest", and "highest"
bin_largest_straight_prob_regression = get_histogram_bin_bounds(straight_prob_metrics_bins, "straight_probability", bin_num)
selected_chums_straight_prob = []
compare_select_scenarios(
    "straight_probability",
    meta_data_control,
    meta_data_candidate,
    [vis_time_vs_distance],
    "rel_diff_between",
    scenario_lookup,
    deviation_range=bin_largest_straight_prob_regression,
    limit=3,
    argus_layout=argus_layout,
    selected_chums=selected_chums_straight_prob,
)

In [None]:
# Introspect a single scenario
ntlj_introspection_mode(selected_chums_straight_prob, meta_data_control, meta_data_candidate, scenario_lookup,
                        argus_layout=argus_layout)

## 5. Validations for ZRSS Metrics (For ZRSS developers only)

In [None]:
print("ZRSS metric version for control branch: {}".format(meta_data_control[0]["zrss_version"]))
print("ZRSS metric version for candidate branch: {}".format(meta_data_candidate[0]["zrss_version"]))
selected_chums_zrss = []
create_metric_aggregation_dataframe(configs["zrss"], meta_data_control, meta_data_candidate)

### 5.1 Filtered Time to Collision

In [None]:
# Deviation in filtered time to collision
config = [get_config(configs_all, "filtered_min_time_to_collision")]
html, filtered_ttc_metrics_bins = vis_ntlj_histograms(config, meta_data_control, meta_data_candidate, subplots=True)
html

In [None]:
# Scenarios in a specific bar in the histogram of deviations in filtered time to collision
bin_num = "lowest" # valid values are 1-9, "lowest", and "highest"
bin_largest_regression = get_histogram_bin_bounds(filtered_ttc_metrics_bins, "filtered_min_time_to_collision", bin_num)
compare_select_scenarios(
    "filtered_min_time_to_collision",
    meta_data_control,
    meta_data_candidate,
    [vis_time_vs_ttc, vis_time_vs_filtered_ttc, vis_time_vs_responsibility],
    "rel_diff_between",
    scenario_lookup,
    deviation_range=bin_largest_regression,
    limit=3,
    argus_layout=argus_layout,
    selected_chums=selected_chums_zrss,
)

### 5.2 RSS Time to Collision

In [None]:
# Deviation in RSS time to collision
config = [get_config(configs_all, "rss_min_time_to_collision")]
html, rss_ttc_metrics_bins = vis_ntlj_histograms(config, meta_data_control, meta_data_candidate, subplots=True)
html

In [None]:
# Scenarios in a specific bar in the histogram of deviations in RSS time to collision
bin_num = "lowest" # valid values are 1-9, "lowest", and "highest"
bin_largest_regression = get_histogram_bin_bounds(rss_ttc_metrics_bins, "rss_min_time_to_collision", bin_num)
compare_select_scenarios(
    "rss_min_time_to_collision",
    meta_data_control,
    meta_data_candidate,
    [vis_time_vs_ttc, vis_time_vs_rss_ttc, vis_time_vs_responsibility],
    "rel_diff_between",
    scenario_lookup,
    deviation_range=bin_largest_regression,
    limit=3,
    argus_layout=argus_layout,
    selected_chums=selected_chums_zrss,
)

### 5.3 Severity

In [None]:
# Deviation in severity
severity_df, severity_uris = vis_severity_transition_matrix(meta_data_control, meta_data_candidate)
severity_df
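
A severity transition matrix counts, for each pair of severity levels, how many scenarios moved from one level on control to the other on candidate. The sketch below builds such a matrix with the standard library only. The severity labels, the per-URI dict layout, and all variable names are assumptions for illustration; this is not how `vis_severity_transition_matrix` is implemented.

```python
from collections import Counter, defaultdict

# Made-up severity levels per scenario URI.
control_severity = {"chum://a": "S0", "chum://b": "S1", "chum://c": "S0", "chum://d": "S2"}
candidate_severity = {"chum://a": "S0", "chum://b": "S0", "chum://c": "S1", "chum://d": "S2"}

transition_counts = Counter()
transition_uris = defaultdict(list)
for uri, before in control_severity.items():
    after = candidate_severity[uri]
    transition_counts[(before, after)] += 1
    # Keep the URIs per transition so a cell of the matrix can be inspected later.
    transition_uris[(before, after)].append(uri)

# Rows are control levels, columns are candidate levels.
levels = sorted(set(control_severity.values()) | set(candidate_severity.values()))
matrix = [[transition_counts[(row, col)] for col in levels] for row in levels]

for level, row in zip(levels, matrix):
    print(level, row)
```

Diagonal entries are scenarios whose severity did not change; off-diagonal entries are the ones worth opening in an introspection view.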

In [None]:
# Introspect a single scenario
ntlj_severity_introspection_mode(severity_uris, meta_data_control, meta_data_candidate, scenario_lookup,
                                 argus_layout=argus_layout)

## 6. Problematic Scenarios

### 6.1 Simulation Failures

In [None]:
get_scenarios_where_sim_failed(exp_data_control, exp_data_candidate)

### 6.2 Not Completed Autonomously

In [None]:
get_scenarios_not_completed_autonomously(meta_data_control, meta_data_candidate)

## 7. Metrics Conclusion

### [Metrics Conclusion Documentation](https://docs.google.com/document/d/17vzp8GRx-dE-NVZCUQnsomrBP9I0dFbAP6roCvIfgYI/)

In [None]:
# Deviation in overall seriousness
df_overall = compute_overall_dataframe(meta_data_control, meta_data_candidate, scenario_lookup)
vis_seriousness(df_overall)

In [None]:
# Histograms of delta scores
vis_delta_scores(df_overall)

In [None]:
# Seriousness scores
vis_color_maps(df_overall)

In [None]:
# Introspect a single scenario
selected_chums_seriousness = get_scenarios_by_seriousness(df_overall, 5, lowest=True)
ntlj_conclusion_introspection_mode(selected_chums_seriousness, meta_data_control, meta_data_candidate, df_overall,
                                   scenario_lookup, argus_layout=argus_layout)