Chapter 4 User Guide
This chapter provides detailed guidance on running simulations, configuring parameters, and interpreting results.
4.1 Running Your First Simulation
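A minimal first run looks like the sketch below. It assumes, as the later examples in this chapter do, that run_asa_simulation takes n_steps, initial_size, and a params list and returns a list with a metrics element; passing an empty params list to fall back on the package defaults is an assumption here, not documented behavior.
# Minimal first run (sketch)
results <- run_asa_simulation(
  n_steps = 100,      # number of simulation steps
  initial_size = 50,  # starting headcount
  params = list()     # empty list: rely on the Section 4.2.1 defaults (assumption)
)
# Inspect the tracked metrics
head(results$metrics)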
4.2 Simulation Parameters
4.2.1 Overview of All Parameters
Parameter | Type | Default | Description
---|---|---|---
identity_categories | character vector | c("A", "B", "C", "D", "E") | Possible identity categories
growth_rate | numeric | 0.01 | Proportion to hire each cycle
hiring_frequency | integer | 12 | Steps between hiring cycles
selection_criteria | character | "conscientiousness" | How to select hires
n_interactions_per_step | integer | 5 | Interactions per agent per step
interaction_window | integer | 10 | Steps to consider for satisfaction
turnover_threshold | numeric | -10 | Satisfaction threshold for leaving
turnover_type | character | "threshold" | Type of turnover model
base_turnover_rate | numeric | 0.05 | Base probability of leaving
n_new_applicants | integer | 50 | New applicants per hiring cycle
applicant_attraction_threshold | numeric | -0.5 | Min attraction to stay in pool
max_application_time | integer | 12 | Steps before application expires
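In practice you usually override only a handful of these parameters. The sketch below assumes, as the later examples in this chapter do, that any parameter left out of the list keeps its default from the table above.
# Override a few parameters; unspecified ones keep their defaults (assumption)
params <- list(
  growth_rate = 0.03,               # instead of the 0.01 default
  turnover_type = "probabilistic",  # switch from threshold-based turnover
  base_turnover_rate = 0.03         # raise the base probability of leaving
)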
4.2.2 Detailed Parameter Guide
4.2.2.1 Identity Categories
Controls the types of identities agents can have:
# Default categories (alphabetical labels)
params <- list(identity_categories = c("A", "B", "C", "D", "E"))

# Custom categories, e.g., departments (the default labels are A-E)
params <- list(identity_categories = c("Engineering", "Sales",
                                       "Marketing", "Operations"))
4.2.2.2 Growth and Hiring
Configure organizational growth:
params <- list(
  growth_rate = 0.02,         # 2% growth per cycle
  hiring_frequency = 4,       # Hire every 4 steps
  n_new_applicants = 100,     # Large applicant pool
  selection_criteria = "fit"  # Select based on fit
)
Selection criteria options (compared in the sketch below):
- "conscientiousness": highest conscientiousness scores
- "fit": best person-organization fit
- "random": random selection (baseline)
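To see how the criteria differ in practice, you can run the same configuration under each one and compare the resulting metrics. This is a sketch; it assumes dplyr is available and that unspecified parameters keep their defaults.
library(dplyr)
criteria <- c("conscientiousness", "fit", "random")
criteria_results <- lapply(criteria, function(crit) {
  run_asa_simulation(
    n_steps = 100,
    initial_size = 50,
    params = list(selection_criteria = crit)
  )$metrics %>%
    mutate(selection_criteria = crit)  # tag each run with its criterion
})
criteria_metrics <- bind_rows(criteria_results)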
4.3 Common Simulation Scenarios
4.3.1 Scenario 1: High-Growth Startup
startup_params <- list(
  growth_rate = 0.10,          # 10% growth per hiring cycle
  hiring_frequency = 4,        # Hire every 4 steps (roughly weekly)
  selection_criteria = "fit",  # Culture fit important
  turnover_threshold = -3,     # Low tolerance for dissatisfaction
  n_new_applicants = 200       # Large applicant pool
)
results <- run_asa_simulation(
  n_steps = 260,
  initial_size = 20,
  params = startup_params
)
4.3.2 Scenario 2: Stable Corporation
corp_params <- list(
  growth_rate = 0.005,         # 0.5% growth per hiring cycle
  hiring_frequency = 12,       # Hire every 12 steps (roughly monthly)
  selection_criteria = "conscientiousness",
  turnover_type = "probabilistic",
  base_turnover_rate = 0.02    # 2% base probability of leaving
)
results <- run_asa_simulation(
  n_steps = 520,
  initial_size = 500,
  params = corp_params
)
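If you keep the two runs under separate names (startup_results and corp_results below are illustrative renamings, not objects created by the code above), their headcount trajectories can be compared directly; the column names follow the metrics used in Section 4.4.
library(dplyr)
library(ggplot2)
startup_results <- run_asa_simulation(n_steps = 260, initial_size = 20,
                                      params = startup_params)
corp_results <- run_asa_simulation(n_steps = 520, initial_size = 500,
                                   params = corp_params)
bind_rows(
  startup = startup_results$metrics,
  corporation = corp_results$metrics,
  .id = "scenario"
) %>%
  ggplot(aes(x = time, y = size, color = scenario)) +
  geom_line() +
  labs(title = "Headcount by Scenario", y = "Organization size")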
4.4 Analyzing Results
4.4.1 Time Series Analysis
library(ggplot2)
library(dplyr)
# Calculate moving averages
results$metrics %>%
  mutate(
    ma_satisfaction = zoo::rollmean(avg_satisfaction, 10, fill = NA),
    ma_size = zoo::rollmean(size, 10, fill = NA)
  ) %>%
  ggplot(aes(x = time)) +
  geom_line(aes(y = avg_satisfaction), alpha = 0.3) +
  geom_line(aes(y = ma_satisfaction), color = "blue", size = 1)
4.4.2 Identity Dynamics
# Extract identity proportions over time
identity_props <- results$organization_snapshots %>%
  lapply(function(snapshot) {
    snapshot[is_active == TRUE, .N, by = identity_category] %>%
      mutate(prop = N / sum(N),
             time = snapshot$time[1])
  }) %>%
  bind_rows()
# Plot identity evolution
ggplot(identity_props, aes(x = time, y = prop, color = identity_category)) +
  geom_line(size = 1) +
  labs(title = "Identity Category Evolution",
       y = "Proportion")
4.4.3 Turnover Analysis
# Approximate turnover from step-to-step decreases in headcount
# (net decreases undercount departures in steps that also include hiring)
turnover_analysis <- results$metrics %>%
  mutate(
    period = floor(time / 12),  # group steps into 12-step (monthly) periods
    size_change = size - lag(size, default = first(size))
  ) %>%
  group_by(period) %>%
  summarise(
    departures = sum(pmax(-size_change, 0)),
    avg_size = mean(size),
    turnover_rate = departures / avg_size,
    .groups = "drop"
  )
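Plotting the per-period rate makes turnover spikes easy to spot; this assumes ggplot2 is loaded as in Section 4.4.1.
ggplot(turnover_analysis, aes(x = period, y = turnover_rate)) +
  geom_col() +
  labs(title = "Approximate Turnover Rate by Period",
       x = "Period (12-step blocks)", y = "Turnover rate")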
4.6 Batch Simulations
4.6.1 Parameter Sweeps
# Define parameter grid
param_grid <- expand.grid(
  growth_rate = c(0.01, 0.02, 0.05),
  turnover_threshold = c(-10, -5, -2),
  selection_criteria = c("conscientiousness", "fit", "random"),
  stringsAsFactors = FALSE  # keep selection_criteria as character, not factor
)
# Run simulations
all_results <- list()
for (i in seq_len(nrow(param_grid))) {
  params <- as.list(param_grid[i, ])
  results <- run_asa_simulation(
    n_steps = 260,
    initial_size = 100,
    params = params,
    verbose = FALSE
  )
  all_results[[i]] <- results$metrics %>%
    mutate(
      growth_rate = params$growth_rate,
      turnover_threshold = params$turnover_threshold,
      selection_criteria = params$selection_criteria,
      run_id = i
    )
}
# Combine results
combined_results <- bind_rows(all_results)
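With every run in a single data frame, you can summarize how each parameter combination ends up. The sketch below keeps only the final time step of each run; the column names follow the metrics used in Section 4.4.
# Final-step outcomes for each parameter combination
sweep_summary <- combined_results %>%
  group_by(run_id, growth_rate, turnover_threshold, selection_criteria) %>%
  filter(time == max(time)) %>%
  summarise(final_size = mean(size),
            final_satisfaction = mean(avg_satisfaction),
            .groups = "drop")
head(sweep_summary)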
4.6.2 Replication Studies
# Run multiple replications (my_params: a parameter list as in Section 4.2)
n_replications <- 10
replications <- list()
for (rep in seq_len(n_replications)) {
  set.seed(rep)  # different random seed for each replication
  results <- run_asa_simulation(
    n_steps = 260,
    initial_size = 100,
    params = my_params
  )
  replications[[rep]] <- results$metrics %>%
    mutate(replication = rep)
}
# Analyze variance across replications
bind_rows(replications) %>%
  group_by(time) %>%
  summarise(
    mean_size = mean(size),
    sd_size = sd(size),
    mean_satisfaction = mean(avg_satisfaction),
    sd_satisfaction = sd(avg_satisfaction)
  )
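A ribbon plot of the mean plus or minus one standard deviation makes the replication envelope easy to read; this assumes dplyr and ggplot2 are loaded as in Section 4.4.
bind_rows(replications) %>%
  group_by(time) %>%
  summarise(mean_size = mean(size), sd_size = sd(size)) %>%
  ggplot(aes(x = time, y = mean_size)) +
  geom_ribbon(aes(ymin = mean_size - sd_size, ymax = mean_size + sd_size),
              alpha = 0.2) +
  geom_line(color = "blue") +
  labs(title = "Organization Size Across Replications",
       y = "Mean size (ribbon: ±1 SD)")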
4.8 Troubleshooting
4.8.1 Common Issues
No hiring occurring:
- Check that growth_rate > 0
- Verify that hiring_frequency aligns with n_steps (at least one hiring cycle must fit in the run)
- Ensure the applicant pool is not empty

Rapid organization collapse:
- Increase turnover_threshold (make it less negative)
- Reduce base_turnover_rate
- Check the satisfaction calculations

Unrealistic homogenization:
- Increase n_interactions_per_step
- Use selection_criteria = "random"
- Verify diversity preferences
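A few quick checks can help narrow down which of the issues above you are hitting. This sketch uses the metrics columns from Section 4.4; the specific values are illustrative.
# Did the organization ever grow? FALSE suggests hiring never fired
any(diff(results$metrics$size) > 0)
# How many hiring cycles fit into the run? (n_steps / hiring_frequency)
floor(260 / 4)
# How low did average satisfaction get relative to turnover_threshold?
min(results$metrics$avg_satisfaction, na.rm = TRUE)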
4.9 Metrics Deep Dive
Understanding the metrics output is crucial for interpreting simulation results. This section explains each metric in detail: how it is calculated and what it reveals about organizational dynamics.
4.9.1 Overview of Metrics
The simulation tracks over 20 metrics at each time step, grouped into several categories:
- Organizational Composition: Size and identity distribution
- Diversity Indices: Multiple measures of heterogeneity
- Personality Distributions: Big Five trait statistics
- Satisfaction Metrics: Employee well-being indicators
4.9.2 Identity and Diversity Metrics
4.9.2.1 Blau’s Index (Default)
- Range: 0 (homogeneous) to 0.8 (maximum diversity with 5 categories)
- Interpretation: Probability two randomly selected employees differ in identity
- Why it matters: Standard I-O psychology metric for categorical diversity
- Example: 0.75 indicates high diversity; 0.25 indicates one dominant group
4.9.2.2 Shannon Entropy
- Range: 0 (homogeneous) to log(5) ≈ 1.61 (equal distribution)
- Interpretation: Information-theoretic measure of uncertainty
- Why it matters: More sensitive to rare categories than Blau’s
- Example: 1.5 indicates near-equal distribution; 0.5 indicates strong dominance
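Both indices can be computed directly from the category proportions: Blau's index is one minus the sum of squared proportions, and Shannon entropy is the negative sum of p * log(p) over the non-empty categories. A minimal sketch in base R:
# Diversity indices from a vector of category proportions
blau_index <- function(props) 1 - sum(props^2)
shannon_entropy <- function(props) {
  p <- props[props > 0]  # drop empty categories to avoid log(0)
  -sum(p * log(p))
}
props <- c(A = 0.30, B = 0.25, C = 0.20, D = 0.15, E = 0.10)
blau_index(props)       # 0.775: high diversity (max 0.8 with 5 categories)
shannon_entropy(props)  # ~1.54, near the log(5) ≈ 1.61 maximum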
4.9.2.3 Category Proportions (prop_A through prop_E)
- Range: 0 to 1 for each category
- Interpretation: Fraction of employees in each identity category
- Why it matters: Direct view of organizational composition
- Patterns to watch:
  - Gradual drift toward homogeneity
  - Sudden shifts after mass turnover
  - Equilibrium distributions
4.9.3 Personality Trait Metrics
For each Big Five trait, the simulation tracks:
4.9.3.1 Average Values (avg_openness, etc.)
- Range: 0 to 1
- Interpretation: Mean trait level in the organization
- Organizational implications:
  - Openness: Innovation potential, change readiness
  - Conscientiousness: Reliability, performance orientation
  - Extraversion: Communication patterns, collaboration
  - Agreeableness: Conflict levels, team cohesion
  - Emotional Stability: Stress resistance, turnover risk
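These averages can be plotted together to watch the personality composition drift over time. The sketch below assumes the trait columns follow the avg_openness naming pattern noted above (the names other than avg_openness are assumptions) and that dplyr and ggplot2 are loaded.
library(tidyr)
# Assumed trait column names, following the avg_openness pattern
trait_cols <- c("avg_openness", "avg_conscientiousness", "avg_extraversion",
                "avg_agreeableness", "avg_emotional_stability")
results$metrics %>%
  select(time, all_of(trait_cols)) %>%
  pivot_longer(-time, names_to = "trait", values_to = "value") %>%
  ggplot(aes(x = time, y = value, color = trait)) +
  geom_line() +
  labs(title = "Big Five Trait Averages Over Time",
       y = "Mean trait level (0-1)")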
4.9.4 Satisfaction Metrics
4.9.5 Interpreting Metric Interactions
4.9.5.1 The Diversity-Satisfaction Paradox
- High diversity often correlates with lower initial satisfaction
- Homophily preferences drive this relationship
- Long-term benefits may offset short-term costs
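You can check whether this pattern shows up in a given run with a simple contemporaneous correlation between the diversity and satisfaction metrics; remember this is a descriptive association, not a causal estimate.
# Association between diversity and satisfaction across time steps
with(results$metrics, cor(blau_index, avg_satisfaction, use = "complete.obs"))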
4.9.6 Advanced Metric Analysis
4.9.7 Visualization Best Practices
4.9.7.1 Multi-Metric Dashboard
library(ggplot2)
library(patchwork)
library(data.table)
# Standardize metrics for comparison; as.numeric() drops the matrix
# attributes that scale() attaches
results$metrics[, `:=`(
  std_diversity = as.numeric(scale(blau_index)),
  std_satisfaction = as.numeric(scale(avg_satisfaction)),
  std_size = as.numeric(scale(organization_size))
)]
# Create aligned time series
p_combined <- ggplot(results$metrics, aes(x = time_step)) +
  geom_line(aes(y = std_diversity, color = "Diversity")) +
  geom_line(aes(y = std_satisfaction, color = "Satisfaction")) +
  geom_line(aes(y = std_size, color = "Size")) +
  scale_color_manual(values = c("Diversity" = "purple",
                                "Satisfaction" = "green",
                                "Size" = "blue")) +
  labs(title = "Standardized Organizational Metrics",
       y = "Standardized Value (z-score)",
       color = "Metric") +
  theme_minimal()
p_combined
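Because patchwork is loaded, the standardized panel can be stacked with other plots. The p_size panel below is illustrative, reusing the same metrics columns.
# Raw headcount panel to pair with the standardized series
p_size <- ggplot(results$metrics, aes(x = time_step, y = organization_size)) +
  geom_line() +
  labs(y = "Headcount") +
  theme_minimal()
p_combined / p_size  # patchwork's "/" stacks plots vertically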
4.9.8 Common Misinterpretations to Avoid
- Correlation ≠ Causation: High diversity causing low satisfaction may be mediated by homophily preferences
- Snapshot Bias: Single time points miss the dynamics; always examine trajectories
- Scale Sensitivity: Raw values are less meaningful than trends and relative changes
- Metric Gaming: Optimizing one metric often degrades others
- Initial Condition Dependence: Early randomness can have lasting effects
4.9.9 Using Metrics for Model Validation
Compare simulation metrics to empirical data:
# Example validation checks (assumes 12 steps ≈ 1 month, so 144 steps ≈ 1 year)
empirical_turnover_rate <- 0.15  # annual rate from HR records
size_drops <- pmax(-diff(results$metrics$organization_size), 0)
simulated_annual_turnover <- sum(size_drops) /
  mean(results$metrics$organization_size) *
  (144 / (nrow(results$metrics) - 1))

empirical_diversity <- 0.65  # Blau's index from an employee survey
simulated_diversity_range <- range(results$metrics$blau_index)

# Check whether empirical values fall within (or near) the simulated values
turnover_passed <- abs(simulated_annual_turnover - empirical_turnover_rate) <= 0.05
diversity_passed <- empirical_diversity >= simulated_diversity_range[1] &
  empirical_diversity <= simulated_diversity_range[2]
validation_passed <- turnover_passed & diversity_passed