Chapter 4 User Guide

This chapter provides detailed guidance on running simulations, configuring parameters, and interpreting results.

4.1 Running Your First Simulation

4.1.1 Basic Simulation

The simplest way to run a simulation uses all default parameters:

# Load the simulation engine
source("simulation/engine.R")

# Run with defaults
results <- run_asa_simulation()

4.1.2 Understanding the Output

The simulation returns a list with four components:

results$final_organization  # Final state data.table
results$metrics            # Time series metrics
results$parameters         # Parameters used
results$organization_snapshots  # Periodic snapshots
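
To see what each component holds before diving into analysis, inspect them directly (the exact columns depend on the metrics your engine version records):

str(results$parameters)                 # Parameters actually used for the run
head(results$metrics)                   # One row of metrics per time step
head(results$final_organization)        # One row per agent at the end of the run
length(results$organization_snapshots)  # Number of periodic snapshots saved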

4.2 Simulation Parameters

4.2.1 Overview of All Parameters

Parameter                        Type              Default                  Description
identity_categories              character vector  c("A","B","C","D","E")   Possible identity categories
growth_rate                      numeric           0.01                     Proportion to hire each cycle
hiring_frequency                 integer           12                       Steps between hiring cycles
selection_criteria               character         "conscientiousness"      How to select hires
n_interactions_per_step          integer           5                        Interactions per agent per step
interaction_window               integer           10                       Steps to consider for satisfaction
turnover_threshold               numeric           -10                      Satisfaction threshold for leaving
turnover_type                    character         "threshold"              Type of turnover model
base_turnover_rate               numeric           0.05                     Base probability of leaving
n_new_applicants                 integer           50                       New applicants per hiring cycle
applicant_attraction_threshold   numeric           -0.5                     Min attraction to stay in pool
max_application_time             integer           12                       Steps before application expires
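
Any subset of these parameters can be overridden; anything left out keeps its default. A minimal sketch, using the same run_asa_simulation() call introduced in Section 4.1:

# Override only the parameters you care about; the rest keep their defaults
my_params <- list(
  growth_rate = 0.03,        # 3% of current headcount hired each cycle
  hiring_frequency = 6       # Hire every 6 steps
)

results <- run_asa_simulation(params = my_params)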

4.2.2 Detailed Parameter Guide

4.2.2.1 Identity Categories

Controls the types of identities agents can have:

# Default categories (alphabetical labels)
params <- list(identity_categories = c("A", "B", "C", "D", "E"))

# Custom categories (e.g., departments) override the default
# alphabetical labels (A-E)
params <- list(identity_categories = c("Engineering", "Sales", 
                                      "Marketing", "Operations"))

4.2.2.2 Growth and Hiring

Configure organizational growth:

params <- list(
  growth_rate = 0.02,        # 2% growth per cycle
  hiring_frequency = 4,      # Hire every 4 steps
  n_new_applicants = 100,    # Large applicant pool
  selection_criteria = "fit" # Select based on fit
)

Selection criteria options:

  • "conscientiousness": Highest conscientiousness scores
  • "fit": Best person-organization fit
  • "random": Random selection (baseline)

4.2.2.3 Interaction Settings

Control how agents interact:

params <- list(
  n_interactions_per_step = 10,  # More interactions
  interaction_window = 20        # Longer memory
)

4.2.2.4 Turnover Configuration

Two turnover models available:

Threshold Model:

params <- list(
  turnover_type = "threshold",
  turnover_threshold = -5  # Leave if satisfaction < -5
)

Probabilistic Model:

params <- list(
  turnover_type = "probabilistic",
  base_turnover_rate = 0.10  # 10% base turnover
)

4.3 Common Simulation Scenarios

4.3.1 Scenario 1: High-Growth Startup

startup_params <- list(
  growth_rate = 0.10,           # 10% growth per hiring cycle
  hiring_frequency = 4,         # Hire every 4 steps
  selection_criteria = "fit",   # Culture fit is the priority
  turnover_threshold = -3,      # Low tolerance for dissatisfaction
  n_new_applicants = 200        # Large applicant pool
)

results <- run_asa_simulation(
  n_steps = 260,
  initial_size = 20,
  params = startup_params
)

4.3.2 Scenario 2: Stable Corporation

corp_params <- list(
  growth_rate = 0.005,          # 0.5% growth per hiring cycle
  hiring_frequency = 12,        # Hire every 12 steps (the default)
  selection_criteria = "conscientiousness",
  turnover_type = "probabilistic",
  base_turnover_rate = 0.02     # 2% base turnover probability
)

results <- run_asa_simulation(
  n_steps = 520,
  initial_size = 500,
  params = corp_params
)

4.3.3 Scenario 3: Diversity-Focused Organization

diversity_params <- list(
  growth_rate = 0.02,
  selection_criteria = "random",  # Reduce selection bias
  n_interactions_per_step = 20,   # Increase mixing
  interaction_window = 30         # Longer relationship building
)

# Also modify agent preferences
# (Requires custom initialization - see Developer Guide)

4.4 Analyzing Results

4.4.1 Time Series Analysis

library(data.table)  # the results tables are data.tables; also provides frollapply() used in Section 4.9
library(ggplot2)
library(dplyr)

# Calculate moving averages
results$metrics %>%
  mutate(
    ma_satisfaction = zoo::rollmean(avg_satisfaction, 10, fill = NA),
    ma_size = zoo::rollmean(size, 10, fill = NA)
  ) %>%
  ggplot(aes(x = time)) +
  geom_line(aes(y = avg_satisfaction), alpha = 0.3) +
  geom_line(aes(y = ma_satisfaction), color = "blue", linewidth = 1)

4.4.2 Identity Dynamics

# Extract identity proportions over time
identity_props <- results$organization_snapshots %>%
  lapply(function(snapshot) {
    snapshot[is_active == TRUE, .N, by = identity_category] %>%
      mutate(prop = N / sum(N), 
             time = snapshot$time[1])
  }) %>%
  bind_rows()

# Plot identity evolution
ggplot(identity_props, aes(x = time, y = prop, color = identity_category)) +
  geom_line(linewidth = 1) +
  labs(title = "Identity Category Evolution",
       y = "Proportion")

4.4.3 Turnover Analysis

# Calculate per-period turnover rates
# Approximate departures as step-to-step declines in headcount (hiring only
# happens every hiring_frequency steps, so most decreases reflect turnover)
turnover_analysis <- results$metrics %>%
  arrange(time) %>%
  mutate(
    period = floor(time / 12),  # 12-step periods
    departures = pmax(lag(size, default = first(size)) - size, 0)
  ) %>%
  group_by(period) %>%
  summarise(
    turnover_count = sum(departures),
    avg_size = mean(size),
    turnover_rate = turnover_count / avg_size
  )

4.5 Saving and Loading Results

4.5.1 Saving Simulation Output

# Save with automatic file naming
save_simulation_results(results, "my_simulation")

# Creates:
# - my_simulation_metrics.csv
# - my_simulation_params.rds
# - my_simulation_final_org.csv
# - my_simulation_snapshots.rds (if requested)

4.5.2 Loading Previous Results

# Load saved results
metrics <- fread("my_simulation_metrics.csv")
params <- readRDS("my_simulation_params.rds")
final_org <- fread("my_simulation_final_org.csv")

# Recreate results object
results <- list(
  metrics = metrics,
  parameters = params,
  final_organization = final_org
)

4.6 Batch Simulations

4.6.1 Parameter Sweeps

# Define parameter grid (keep strings as characters, not factors)
param_grid <- expand.grid(
  growth_rate = c(0.01, 0.02, 0.05),
  turnover_threshold = c(-10, -5, -2),
  selection_criteria = c("conscientiousness", "fit", "random"),
  stringsAsFactors = FALSE
)

# Run simulations
all_results <- list()
for(i in 1:nrow(param_grid)) {
  params <- as.list(param_grid[i,])
  
  results <- run_asa_simulation(
    n_steps = 260,
    initial_size = 100,
    params = params,
    verbose = FALSE
  )
  
  all_results[[i]] <- results$metrics %>%
    mutate(
      growth_rate = params$growth_rate,
      turnover_threshold = params$turnover_threshold,
      selection_criteria = params$selection_criteria,
      run_id = i
    )
}

# Combine results
combined_results <- bind_rows(all_results)

4.6.2 Replication Studies

# Run multiple replications
n_replications <- 10
replications <- list()

for(rep in 1:n_replications) {
  set.seed(rep)  # Different random seed
  
  results <- run_asa_simulation(
    n_steps = 260,
    initial_size = 100,
    params = my_params
  )
  
  replications[[rep]] <- results$metrics %>%
    mutate(replication = rep)
}

# Analyze variance across replications
bind_rows(replications) %>%
  group_by(time) %>%
  summarise(
    mean_size = mean(size),
    sd_size = sd(size),
    mean_satisfaction = mean(avg_satisfaction),
    sd_satisfaction = sd(avg_satisfaction)
  )

4.7 Performance Tips

4.7.1 Memory Management

# For large simulations, reduce snapshot frequency
results <- run_asa_simulation(
  n_steps = 1000,
  initial_size = 5000,
  params = list(
    snapshot_frequency = 50  # Only save every 50 steps
  )
)

# Clear memory between runs
rm(results)
gc()

4.7.2 Speed Optimization

# Reduce interaction frequency for faster runs
fast_params <- list(
  n_interactions_per_step = 2,  # Fewer interactions
  interaction_window = 5        # Shorter memory
)

# Profile simulation performance
library(profvis)
profvis({
  results <- run_asa_simulation(n_steps = 100)
})

4.8 Troubleshooting

4.8.1 Common Issues

No hiring occurring:

  • Check that growth_rate > 0
  • Verify hiring_frequency aligns with n_steps
  • Ensure the applicant pool is not empty

Rapid organization collapse:

  • Increase turnover_threshold (make it less negative)
  • Reduce base_turnover_rate
  • Check the satisfaction calculations

Unrealistic homogenization:

  • Increase n_interactions_per_step
  • Use selection_criteria = "random"
  • Verify diversity preferences
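
A few quick checks for these symptoms, sketched against the metric columns used elsewhere in this guide (size, avg_satisfaction, blau_index); adjust the names if your metrics table differs:

# No hiring: did headcount ever increase?
any(diff(results$metrics$size) > 0)

# Rapid collapse: distribution of step-to-step headcount changes
summary(diff(results$metrics$size))

# Homogenization: is diversity still changing, or has it flattened out?
tail(results$metrics$blau_index, 20)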

4.8.2 Debugging Tools

# Enable detailed logging
debug_results <- run_asa_simulation(
  n_steps = 20,
  initial_size = 10,
  verbose = TRUE,
  params = list(debug = TRUE)
)

# Inspect a specific saved snapshot (here, the first one)
snapshot_1 <- results$organization_snapshots[[1]]
summary(snapshot_1)

4.9 Metrics Deep Dive

Understanding the metrics output is crucial for interpreting simulation results. This section provides detailed explanations of each metric, their calculations, and what they reveal about organizational dynamics.

4.9.1 Overview of Metrics

The simulation tracks over 20 metrics at each time step, grouped into several categories:

  1. Organizational Composition: Size and identity distribution
  2. Diversity Indices: Multiple measures of heterogeneity
  3. Personality Distributions: Big Five trait statistics
  4. Satisfaction Metrics: Employee well-being indicators
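
To see exactly which metrics your build records, list the columns of the metrics table:

# List every metric column tracked at each time step
names(results$metrics)

# For example, the identity-composition columns
grep("^prop_", names(results$metrics), value = TRUE)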

4.9.2 Identity and Diversity Metrics

4.9.2.1 Blau’s Index (Default)

# Formula: 1 - Σ(p_i^2)
# Where p_i is the proportion of category i
  • Range: 0 (homogeneous) to 0.8 (maximum diversity with 5 categories)
  • Interpretation: Probability two randomly selected employees differ in identity
  • Why it matters: Standard I-O psychology metric for categorical diversity
  • Example: 0.75 indicates high diversity; 0.25 indicates one dominant group
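
Blau's index can be recomputed by hand from the active roster as a sanity check (a sketch using the is_active and identity_category columns referenced in Section 4.4):

# Recompute Blau's index from the final organization roster
p <- prop.table(table(
  results$final_organization[is_active == TRUE, identity_category]
))
blau <- 1 - sum(p^2)
blau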

4.9.2.2 Shannon Entropy

# Formula: -Σ(p_i * log(p_i))
# Where p_i is the proportion of category i
  • Range: 0 (homogeneous) to log(5) ≈ 1.61 (equal distribution)
  • Interpretation: Information-theoretic measure of uncertainty
  • Why it matters: More sensitive to rare categories than Blau’s
  • Example: 1.5 indicates near-equal distribution; 0.5 indicates strong dominance
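
Shannon entropy follows from the same category proportions; empty categories are dropped so the 0 * log(0) terms do not produce NaN:

# Shannon entropy from the identity-category proportions
p <- prop.table(table(
  results$final_organization[is_active == TRUE, identity_category]
))
p <- p[p > 0]                # drop empty categories (0 * log(0) is undefined)
shannon <- -sum(p * log(p))
shannon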

4.9.2.3 Category Proportions (prop_A through prop_E)

  • Range: 0 to 1 for each category
  • Interpretation: Fraction of employees in each identity category
  • Why it matters: Direct view of organizational composition
  • Patterns to watch:
    • Gradual drift toward homogeneity
    • Sudden shifts after mass turnover
    • Equilibrium distributions
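
If your metrics table includes these prop_ columns, they can be reshaped to long format and plotted directly. A sketch, assuming results$metrics is a data.table with the time_step column used later in this section:

library(data.table)
library(ggplot2)

# Reshape prop_A ... prop_E into long format and plot their trajectories
prop_long <- melt(
  results$metrics,
  id.vars = "time_step",
  measure.vars = patterns("^prop_"),
  variable.name = "category",
  value.name = "proportion"
)

ggplot(prop_long, aes(x = time_step, y = proportion, color = category)) +
  geom_line() +
  labs(title = "Identity Category Proportions Over Time", y = "Proportion")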

4.9.3 Personality Trait Metrics

For each Big Five trait, the simulation tracks:

4.9.3.1 Average Values (avg_openness, etc.)

  • Range: 0 to 1
  • Interpretation: Mean trait level in the organization
  • Organizational implications:
    • Openness: Innovation potential, change readiness
    • Conscientiousness: Reliability, performance orientation
    • Extraversion: Communication patterns, collaboration
    • Agreeableness: Conflict levels, team cohesion
    • Emotional Stability: Stress resistance, turnover risk

4.9.3.2 Standard Deviations (sd_openness, etc.)

  • Range: 0 to ~0.5 (theoretical max)
  • Interpretation: Trait heterogeneity in the organization
  • Why it matters:
    • Low SD indicates cultural convergence
    • High SD suggests diverse perspectives
    • Zero SD means complete homogenization
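
As a cross-check, trait dispersion can be computed straight from the roster. A sketch that assumes the final_organization table stores Big Five traits under columns such as conscientiousness and openness (verify the names in your build):

# Cross-check the reported sd_* metrics against the active roster
results$final_organization[is_active == TRUE,
  .(sd_conscientiousness = sd(conscientiousness),
    sd_openness          = sd(openness))]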

4.9.4 Satisfaction Metrics

4.9.4.1 Average Satisfaction (avg_satisfaction)

  • Range: Typically -20 to +20
  • Interpretation: Overall employee well-being
  • Key thresholds:
    • Above 0: Generally positive environment
    • Below -5: Risk of increased turnover
    • Below -10: Crisis level (default turnover threshold)
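
These thresholds are easy to monitor programmatically; for example, the first step at which the organization enters the turnover-risk zone (assuming the time_step and avg_satisfaction columns used in this section):

# First time step where average satisfaction falls below the risk threshold
risk_steps <- results$metrics$time_step[results$metrics$avg_satisfaction < -5]
if (length(risk_steps) > 0) risk_steps[1] else NA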

4.9.4.2 Satisfaction Standard Deviation (sd_satisfaction)

  • Range: 0 to ~10
  • Interpretation: Variation in employee experiences
  • Warning signs:
    • High SD with low average: Polarized organization
    • Increasing SD: Growing disparities
    • Very low SD: Possible groupthink

4.9.5 Interpreting Metric Interactions

4.9.5.1 The Diversity-Satisfaction Paradox

# Common pattern
plot(results$metrics$blau_index, results$metrics$avg_satisfaction)
  • High diversity often correlates with lower initial satisfaction
  • Homophily preferences drive this relationship
  • Long-term benefits may offset short-term costs

4.9.5.2 Personality Convergence Cascade

# Track convergence
convergence_rate <- diff(results$metrics$sd_conscientiousness)
  • Selection on one trait affects all traits
  • Convergence accelerates over time
  • Can lead to organizational blindspots

4.9.5.3 Turnover Spirals

# Identify spiral onset
turnover_indicator <- results$metrics$organization_size < 
                     lag(results$metrics$organization_size)
  • Low satisfaction → turnover → lower diversity → lower satisfaction
  • Critical to catch early
  • Intervention points: hiring strategy, satisfaction boost

4.9.6 Advanced Metric Analysis

4.9.6.1 Creating Composite Indices

# Organizational Health Index (weights are illustrative)
initial_size <- results$metrics$organization_size[1]  # Headcount at the first recorded step

results$metrics[, health_index :=
  0.3 * (avg_satisfaction + 10) / 20 +        # Normalized satisfaction
  0.3 * blau_index +                          # Diversity
  0.2 * (organization_size / initial_size) +  # Growth relative to start
  0.2 * (1 - sd_satisfaction / 10)            # Cohesion
]

4.9.6.2 Detecting Phase Transitions

# Find inflection points
library(changepoint)
cpt_diversity <- cpt.mean(results$metrics$blau_index)
plot(cpt_diversity)

4.9.6.3 Metric Stability Analysis

# Rolling window stability
window <- 26  # Half year
results$metrics[, `:=`(
  diversity_stability = frollapply(blau_index, window, sd),
  satisfaction_stability = frollapply(avg_satisfaction, window, sd)
)]

4.9.7 Visualization Best Practices

4.9.7.1 Multi-Metric Dashboard

library(ggplot2)
library(patchwork)

# Standardize metrics for comparison (as.numeric() drops the matrix
# attributes that scale() attaches)
results$metrics[, `:=`(
  std_diversity = as.numeric(scale(blau_index)),
  std_satisfaction = as.numeric(scale(avg_satisfaction)),
  std_size = as.numeric(scale(organization_size))
)]

# Create aligned time series
p_combined <- ggplot(results$metrics, aes(x = time_step)) +
  geom_line(aes(y = std_diversity, color = "Diversity")) +
  geom_line(aes(y = std_satisfaction, color = "Satisfaction")) +
  geom_line(aes(y = std_size, color = "Size")) +
  scale_color_manual(values = c("Diversity" = "purple", 
                               "Satisfaction" = "green",
                               "Size" = "blue")) +
  labs(title = "Standardized Organizational Metrics",
       y = "Standardized Value (z-score)",
       color = "Metric") +
  theme_minimal()

4.9.7.2 Phase Space Visualization

# 3D phase space
library(plotly)
plot_ly(results$metrics, 
        x = ~blau_index, 
        y = ~avg_satisfaction, 
        z = ~organization_size,
        type = "scatter3d",
        mode = "lines+markers",
        color = ~time_step,
        colors = viridisLite::viridis(100))  # viridis palette; "Viridis" is not a brewer palette name

4.9.8 Common Misinterpretations to Avoid

  1. Correlation ≠ Causation: High diversity causing low satisfaction may be mediated by homophily preferences
  2. Snapshot Bias: Single time points miss dynamics - always examine trajectories
  3. Scale Sensitivity: Raw values less meaningful than trends and relative changes
  4. Metric Gaming: Optimizing one metric often degrades others
  5. Initial Condition Dependence: Early randomness can have lasting effects

4.9.9 Using Metrics for Model Validation

Compare simulation metrics to empirical data:

# Example validation checks

# Turnover: empirical benchmark vs. the simulated pattern of annual declines
empirical_turnover_rate <- 0.15  # Annual

# Share of simulated years (52 steps each) showing a net decline in headcount
yearly_sizes <- results$metrics$organization_size[seq(1, 260, by = 52)]
simulated_decline_share <- mean(diff(yearly_sizes) < 0)

# Diversity: does the empirical Blau's index fall within the simulated range?
empirical_diversity <- 0.65  # Blau's index from survey
simulated_diversity_range <- range(results$metrics$blau_index)

validation_passed <- empirical_diversity >= simulated_diversity_range[1] &&
  empirical_diversity <= simulated_diversity_range[2]