Chapter 9 Contributor’s Walkthrough

Welcome to the ASA ABM v2 contributor’s guide! This chapter provides a practical walkthrough for developers who want to contribute to or extend the codebase. Whether you’re adding new features, fixing bugs, or improving performance, this guide will help you navigate the codebase effectively.

9.1 Getting Started as a Contributor

9.1.1 Setting up the Development Environment

Clone the repository:

git clone [repository-url]
cd asa_abm_v2

Install R dependencies:

# Core dependencies
install.packages(c(
  "data.table",      # High-performance data manipulation
  "checkmate",       # Input validation
  "ggplot2",         # Visualization
  "plotly",          # Interactive plots
  "tidyverse",       # Data manipulation utilities
  "testthat",        # Testing framework
  "microbenchmark",  # Performance testing
  "profvis"          # Profiling tool
))

# Documentation dependencies
install.packages(c(
  "bookdown",
  "knitr",
  "rmarkdown"
))

Verify installation:

source("run_simulation.R")

# Run a quick test simulation
test_sim <- run_asa_simulation(
  n_steps = 10,
  initial_size = 20,
  verbose = TRUE
)

9.1.2 Understanding the Codebase Structure

The codebase follows a modular architecture:

asa_abm_v2/
├── core/                  # Core components
│   ├── agent.R           # Agent definitions
│   ├── organization.R    # Organization structure
│   └── interactions.R    # Social interactions
├── simulation/           # Simulation logic
│   ├── engine.R         # Main simulation loop
│   ├── hiring.R         # Hiring processes
│   └── turnover.R       # Turnover mechanisms
├── analysis/            # Analysis tools
├── tests/               # Test suite
├── config/              # Configuration files
└── docs/                # Documentation

9.1.3 Running Tests and Examples

# Run all tests
testthat::test_dir("tests/")

# Run specific test file
testthat::test_file("tests/test_agent.R")

# Run example simulations
source("docs/vignettes/basic_simulation.R")

9.2 Step-by-Step Feature Tutorial: Adding a New Personality Trait

Let’s walk through adding a new personality trait called “risk_tolerance” to the simulation. This tutorial shows exactly which files need modification and how changes flow through the system.

9.2.1 Step 1: Update Agent Definition

First, modify core/agent.R to include the new trait:

# In create_applicant_pool() function around line 30
create_applicant_pool <- function(n_applicants = 50,
                                identity_categories = c("A", "B", "C", "D", "E")) {
  
  # ... existing code ...
  
  # Create applicant pool
  applicants <- data.table(
    agent_id = applicant_ids,
    identity_category = sample(identity_categories, n_applicants, replace = TRUE),
    
    # Big Five personality traits
    openness = rnorm(n_applicants, mean = 0, sd = 1),
    conscientiousness = rnorm(n_applicants, mean = 0, sd = 1),
    extraversion = rnorm(n_applicants, mean = 0, sd = 1),
    agreeableness = rnorm(n_applicants, mean = 0, sd = 1),
    emotional_stability = rnorm(n_applicants, mean = 0, sd = 1),
    
    # NEW TRAIT: Risk tolerance
    risk_tolerance = rnorm(n_applicants, mean = 0, sd = 1),
    
    # Preferences
    diversity_preference = rnorm(n_applicants, mean = 0, sd = 1),
    homophily_preference = rnorm(n_applicants, mean = 0, sd = 1),
    
    # Application state
    attraction = 0,
    application_time = 0
  )
  
  # ... rest of function ...
}

# Also update convert_to_employee() function
convert_to_employee <- function(applicant, hire_time) {
  employee <- copy(applicant)
  employee[, `:=`(
    hire_time = hire_time,
    tenure = 0,
    satisfaction = 0,
    performance = calculate_initial_performance(applicant),
    risk_taking_behavior = 0  # NEW: Initialize behavior based on trait
  )]
  
  return(employee)
}

# NEW FUNCTION: Calculate risk-based performance modifier
calculate_risk_performance_modifier <- function(risk_tolerance) {
  # High risk tolerance can lead to bigger wins or losses
  risk_factor <- pnorm(risk_tolerance) # Convert to 0-1 scale
  modifier <- rnorm(1, mean = 0, sd = risk_factor * 0.2)
  return(modifier)
}

9.2.2 Step 2: Update Organization Initialization

Modify core/organization.R to handle the new trait:

# In initialize_organization() function
initialize_organization <- function(size = 100, 
                                  identity_categories = c("A", "B", "C", "D", "E"),
                                  params = list()) {
  
  # ... existing code ...
  
  # Create initial employees with new trait
  employees <- data.table(
    agent_id = agent_ids,
    identity_category = sample(identity_categories, size, replace = TRUE),
    
    # Personality traits
    openness = rnorm(size, mean = 0, sd = 1),
    conscientiousness = rnorm(size, mean = 0, sd = 1),
    extraversion = rnorm(size, mean = 0, sd = 1),
    agreeableness = rnorm(size, mean = 0, sd = 1),
    emotional_stability = rnorm(size, mean = 0, sd = 1),
    risk_tolerance = rnorm(size, mean = 0, sd = 1),  # NEW
    
    # ... rest of initialization ...
  )
  
  # ... rest of function ...
}

9.2.3 Step 3: Incorporate into Interactions

Update core/interactions.R to use the new trait:

# Add risk tolerance influence to satisfaction calculation
calculate_interaction_satisfaction <- function(agent1, agent2, org_culture = NULL) {
  
  # ... existing similarity calculations ...
  
  # NEW: Risk tolerance compatibility
  risk_compatibility <- 1 - abs(agent1$risk_tolerance - agent2$risk_tolerance) / 4
  
  # Update satisfaction calculation
  satisfaction <- mean(c(
    identity_match,
    personality_similarity,
    risk_compatibility  # NEW component
  ))
  
  # Apply cultural moderation if available
  if (!is.null(org_culture)) {
    culture_fit <- calculate_culture_fit(agent1, org_culture)
    
    # NEW: Risk-taking culture modifier
    if (org_culture$risk_taking_culture > 0) {
      risk_bonus <- agent1$risk_tolerance * org_culture$risk_taking_culture * 0.1
      satisfaction <- satisfaction + risk_bonus
    }
  }
  
  return(satisfaction)
}

9.2.4 Step 4: Update Selection Criteria

Modify simulation/hiring.R to allow selection based on risk tolerance:

# In evaluate_applicants() function
evaluate_applicants <- function(applicants, org_state, selection_criteria = "conscientiousness") {
  
  # ... existing code ...
  
  # Evaluate based on criteria
  if (selection_criteria == "conscientiousness") {
    applicants[, selection_score := conscientiousness]
  } else if (selection_criteria == "cultural_fit") {
    applicants[, selection_score := calculate_cultural_fit(.SD, org_culture), by = agent_id]
  } else if (selection_criteria == "risk_tolerance") {  # NEW
    # Organizations might want risk-takers or risk-averse employees
    target_risk <- org_state$params$target_risk_level %||% 0
    applicants[, selection_score := -abs(risk_tolerance - target_risk)]
  } else if (selection_criteria == "balanced") {
    # NEW: Include risk tolerance in balanced selection
    applicants[, selection_score := (
      conscientiousness * 0.3 +
      emotional_stability * 0.2 +
      agreeableness * 0.2 +
      openness * 0.15 +
      extraversion * 0.1 +
      risk_tolerance * org_state$params$risk_weight %||% 0.05
    )]
  }
  
  # ... rest of function ...
}

9.2.5 Step 5: Add Performance Impact

Update performance calculations in simulation/engine.R:

# Add new function for risk-based events
simulate_risk_events <- function(employees, org_state) {
  # High risk tolerance employees may generate volatile outcomes
  risk_takers <- employees[risk_tolerance > 1]
  
  if (nrow(risk_takers) > 0) {
    risk_takers[, risk_event := runif(.N) < pnorm(risk_tolerance) * 0.1]
    
    # Calculate performance impact
    risk_takers[risk_event == TRUE, performance_modifier := 
                calculate_risk_performance_modifier(risk_tolerance)]
    
    # Update organization metrics
    total_impact <- sum(risk_takers$performance_modifier, na.rm = TRUE)
    org_state$metrics$risk_impact <- c(org_state$metrics$risk_impact, total_impact)
  }
  
  return(employees)
}

# Integrate into main simulation loop
run_simulation_step <- function(org_state, step, params) {
  
  # ... existing step logic ...
  
  # NEW: Simulate risk events
  org_state$employees <- simulate_risk_events(org_state$employees, org_state)
  
  # ... rest of step ...
}

9.2.6 Step 6: Add Tests

Create tests/test_risk_tolerance.R:

library(testthat)
library(data.table)

# Source required files
source("../core/agent.R")
source("../core/organization.R")
source("../core/interactions.R")

test_that("Risk tolerance is properly initialized", {
  # Test applicant creation
  applicants <- create_applicant_pool(n_applicants = 100)
  
  expect_true("risk_tolerance" %in% names(applicants))
  expect_equal(length(applicants$risk_tolerance), 100)
  expect_true(all(!is.na(applicants$risk_tolerance)))
  
  # Test distribution
  expect_true(abs(mean(applicants$risk_tolerance)) < 0.2)
  expect_true(abs(sd(applicants$risk_tolerance) - 1) < 0.2)
})

test_that("Risk tolerance affects interactions", {
  # Create two agents with different risk tolerances
  agent1 <- data.table(
    agent_id = "test1",
    risk_tolerance = 2,
    openness = 0, conscientiousness = 0,
    extraversion = 0, agreeableness = 0,
    emotional_stability = 0,
    identity_category = "A"
  )
  
  agent2 <- data.table(
    agent_id = "test2", 
    risk_tolerance = -2,
    openness = 0, conscientiousness = 0,
    extraversion = 0, agreeableness = 0,
    emotional_stability = 0,
    identity_category = "A"
  )
  
  # Test interaction satisfaction
  satisfaction <- calculate_interaction_satisfaction(agent1, agent2)
  
  # Should be lower due to risk tolerance difference
  expect_true(satisfaction < 0.7)
})

test_that("Risk-based selection works correctly", {
  applicants <- create_applicant_pool(n_applicants = 50)
  
  org_state <- list(
    params = list(target_risk_level = 1.5),
    culture = list(risk_taking_culture = 0.5)
  )
  
  evaluated <- evaluate_applicants(
    applicants, 
    org_state, 
    selection_criteria = "risk_tolerance"
  )
  
  # Check that selection scores favor applicants close to target
  best_applicant <- evaluated[which.max(selection_score)]
  expect_true(abs(best_applicant$risk_tolerance - 1.5) < 1)
})

test_that("Risk performance modifier calculates correctly", {
  # Test multiple risk levels
  low_risk <- calculate_risk_performance_modifier(-2)
  high_risk <- calculate_risk_performance_modifier(2)
  
  # High risk should have higher variance
  set.seed(123)
  low_risk_mods <- replicate(1000, calculate_risk_performance_modifier(-2))
  high_risk_mods <- replicate(1000, calculate_risk_performance_modifier(2))
  
  expect_true(var(high_risk_mods) > var(low_risk_mods))
})

9.2.7 Step 7: Update Documentation

Add to docs/06-api-reference.Rmd:

### Agent Attributes

Agents in the simulation have the following attributes:

- **risk_tolerance**: Normal(0, 1) - Propensity for risk-taking behavior
  - Values > 1: Risk-seeking individuals
  - Values < -1: Risk-averse individuals
  - Affects performance variability and interaction satisfaction

9.3 Code Flow Walkthrough

9.3.1 Simulation Step Trace

Here’s a detailed trace through one simulation step:

# STEP 1: Main loop calls run_simulation_step()
run_asa_simulation()
  └── for (step in 1:n_steps)
        └── run_simulation_step(org_state, step, params)

# STEP 2: Check for hiring needs
run_simulation_step()
  ├── if (step %% params$hiring_frequency == 0)
  │     └── perform_hiring_round()
  │           ├── create_applicant_pool()
  │           ├── calculate_org_attractiveness()
  │           ├── evaluate_applicants()
  │           └── hire_applicants()
  │
  # STEP 3: Process interactions
  ├── simulate_interactions()
  │     ├── sample_interaction_pairs()
  │     ├── for each pair:
  │     │     └── calculate_interaction_satisfaction()
  │     └── update_satisfaction_history()
  │
  # STEP 4: Calculate turnover
  ├── process_turnover()
  │     ├── calculate_turnover_probability()
  │     ├── identify_leavers()
  │     └── remove_employees()
  │
  # STEP 5: Update metrics
  └── update_org_metrics()
        ├── calculate_diversity_index()
        ├── calculate_mean_satisfaction()
        ├── calculate_performance_metrics()
        └── store_step_history()

9.3.2 Data Flow Diagram

┌─────────────────┐     ┌──────────────────┐     ┌─────────────────┐
│ Applicant Pool  │────▶│ Organization     │────▶│ History/Metrics │
└─────────────────┘     └──────────────────┘     └─────────────────┘
        │                        │                         │
        │                        ▼                         │
        │               ┌──────────────────┐              │
        │               │    Employees     │              │
        │               └──────────────────┘              │
        │                        │                         │
        │                        ▼                         │
        │               ┌──────────────────┐              │
        └──────────────▶│  Interactions    │──────────────┘
                        └──────────────────┘

9.3.3 Key Functions and Responsibilities

Module	Function	Responsibility
`engine.R`	`run_asa_simulation()`	Main entry point, orchestrates simulation
`engine.R`	`run_simulation_step()`	Executes one time step
`agent.R`	`create_applicant_pool()`	Generates new applicants
`organization.R`	`initialize_organization()`	Sets up initial state
`hiring.R`	`perform_hiring_round()`	Manages hiring process
`interactions.R`	`simulate_interactions()`	Handles social dynamics
`turnover.R`	`process_turnover()`	Manages employee departures

9.4 Testing Philosophy

9.4.1 Types of Tests

Unit Tests: Test individual functions in isolation

test_that("Diversity index calculates correctly", {
  employees <- data.table(
    identity_category = c("A", "A", "B", "B", "C")
  )

  blau_index <- calculate_diversity_index(employees, method = "blau")
  expect_equal(blau_index, 0.64)  # 1 - (0.4^2 + 0.4^2 + 0.2^2)
})

Integration Tests: Test component interactions

test_that("Hiring process integrates correctly", {
  org_state <- initialize_organization(size = 10)
  initial_size <- nrow(org_state$employees)

  org_state <- perform_hiring_round(org_state, n_to_hire = 5)

  expect_equal(nrow(org_state$employees), initial_size + 5)
  expect_true(all(org_state$employees$hire_time >= 0))
})

Regression Tests: Ensure changes don’t break existing functionality

test_that("Simulation produces consistent results", {
  set.seed(12345)
  sim1 <- run_asa_simulation(n_steps = 10, initial_size = 20)

  set.seed(12345)
  sim2 <- run_asa_simulation(n_steps = 10, initial_size = 20)

  expect_identical(sim1$metrics, sim2$metrics)
})

9.4.2 Writing Good Tests

Follow the AAA pattern:

test_that("Employee satisfaction updates correctly", {
  # ARRANGE - Set up test data
  employee <- data.table(
    agent_id = "emp_001",
    satisfaction = 0,
    satisfaction_history = list(numeric(0))
  )
  
  # ACT - Execute the function
  updated_employee <- update_satisfaction(employee, new_satisfaction = 0.7)
  
  # ASSERT - Check results
  expect_equal(updated_employee$satisfaction, 0.7)
  expect_length(updated_employee$satisfaction_history[[1]], 1)
  expect_equal(updated_employee$satisfaction_history[[1]][1], 0.7)
})

9.4.3 Test Coverage Expectations

Core functions: 100% coverage required
Utility functions: 90% coverage minimum
Edge cases: Must test boundary conditions
Error handling: Test invalid inputs

Check coverage with:

library(covr)
cov <- package_coverage()
report(cov)

9.4.4 Running the Test Suite

# Run all tests with detailed output
testthat::test_dir("tests/", reporter = "progress")

# Run tests for a specific module
testthat::test_file("tests/test_hiring.R")

# Run tests matching a pattern
testthat::test_dir("tests/", filter = "diversity")

# Run with coverage report
covr::report(covr::package_coverage())

9.5 Development Best Practices

9.5.1 R Coding Standards

Function naming: Use descriptive verb_noun format

# Good
calculate_diversity_index()
update_employee_satisfaction()

# Avoid
divIndex()
empSatUpd()

Variable naming: Use snake_case for variables

# Good
employee_count <- nrow(employees)
mean_satisfaction <- mean(employees$satisfaction)

# Avoid
employeeCount <- nrow(employees)
meanSat <- mean(employees$satisfaction)

Function documentation: Use roxygen2 format

#' Calculate organization's cultural distance from target
#' 
#' @param org_culture Current organizational culture (list)
#' @param target_culture Target cultural values (list)
#' @param weights Optional weights for culture dimensions
#' @return Numeric distance score (0 = identical, higher = more different)
#' @examples
#' culture <- list(innovation = 0.7, stability = 0.3)
#' target <- list(innovation = 0.9, stability = 0.1)
#' calculate_cultural_distance(culture, target)
calculate_cultural_distance <- function(org_culture, target_culture, weights = NULL) {
  # Implementation
}

9.5.2 Using data.table Effectively

Reference semantics: Use := for in-place updates

# Efficient - modifies in place
employees[, satisfaction := satisfaction + 0.1]

# Inefficient - creates copy
employees$satisfaction <- employees$satisfaction + 0.1

Group operations: Use by for grouped calculations

# Calculate mean satisfaction by identity category
employees[, .(
  mean_satisfaction = mean(satisfaction),
  count = .N
), by = identity_category]

Keys for performance: Set keys on frequently joined columns

setkey(employees, agent_id)
setkey(interactions, agent1_id, agent2_id)

Chaining operations: Use data.table chaining

employees[
  tenure > 12
][
  , satisfaction_rank := rank(-satisfaction)
][
  satisfaction_rank <= 10
]

9.5.3 Performance Considerations

Preallocate memory:

# Good - preallocate
results <- vector("numeric", n_steps)
for (i in 1:n_steps) {
  results[i] <- calculate_metric(i)
}

# Avoid - growing vectors
results <- c()
for (i in 1:n_steps) {
  results <- c(results, calculate_metric(i))
}

Vectorize operations:

# Good - vectorized
employees[, performance := conscientiousness * 0.3 + 
                           emotional_stability * 0.2 + 
                           satisfaction * 0.5]

# Avoid - row-by-row
for (i in 1:nrow(employees)) {
  employees$performance[i] <- employees$conscientiousness[i] * 0.3 +
                              employees$emotional_stability[i] * 0.2 +
                              employees$satisfaction[i] * 0.5
}

Profile bottlenecks:

library(profvis)

profvis({
  sim_results <- run_asa_simulation(
    n_steps = 100,
    initial_size = 200
  )
})

9.5.4 Documentation Standards

Function-level documentation: Every exported function needs roxygen2 docs

Inline comments: Explain “why”, not “what”

# Good: Explains reasoning
# Use Blau index for categorical diversity as it handles 
# multiple categories better than simple proportion variance
diversity <- calculate_blau_index(categories)

# Avoid: States the obvious
# Calculate diversity
diversity <- calculate_blau_index(categories)

Complex algorithms: Add block comments

# The Attraction-Selection-Attrition process:
# 1. ATTRACTION: Applicants evaluate org attractiveness based on
#    visible diversity and cultural signals
# 2. SELECTION: Organization evaluates applicants using
#    configured criteria (conscientiousness, fit, etc.)
# 3. ATTRITION: Employees leave based on satisfaction levels
#    accumulated through interactions

9.6 Git Workflow

9.6.1 Branching Strategy

We use Git Flow:

main
  └── develop
        ├── feature/add-risk-tolerance
        ├── feature/improve-performance
        ├── bugfix/hiring-calculation
        └── hotfix/critical-error

9.6.2 Branch Naming Conventions

Features: feature/descriptive-name
Bugfixes: bugfix/issue-description
Hotfixes: hotfix/critical-issue
Releases: release/version-number

9.6.3 Commit Message Conventions

Follow the conventional commits format:

type(scope): subject

body

footer

Examples:

feat(hiring): add risk-based selection criteria

- Implement risk_tolerance attribute for agents
- Add selection strategy based on risk preference
- Update tests for new functionality

Closes #123

fix(turnover): correct satisfaction threshold calculation

The previous implementation used >= instead of >, causing
employees with exactly threshold satisfaction to leave.

perf(interactions): optimize interaction sampling

- Replace nested loops with vectorized operations
- Add index on interaction history
- Reduces step time by ~40% for large organizations

docs(api): update hiring function documentation

- Add examples for new selection criteria
- Clarify parameter descriptions
- Fix typos in return value docs

9.6.4 Pull Request Process

Create feature branch:

git checkout develop
git pull origin develop
git checkout -b feature/your-feature-name

Make changes and commit:

git add -A
git commit -m "feat(module): description"

Keep branch updated:

git checkout develop
git pull origin develop
git checkout feature/your-feature-name
git rebase develop

Push and create PR:

git push origin feature/your-feature-name

PR description template:

## Description
Brief description of changes

## Type of Change
- [ ] Bug fix
- [ ] New feature
- [ ] Breaking change
- [ ] Documentation update

## Testing
- [ ] Unit tests pass
- [ ] Integration tests pass
- [ ] Manual testing completed

## Checklist
- [ ] Code follows style guidelines
- [ ] Self-review completed
- [ ] Comments added for complex sections
- [ ] Documentation updated
- [ ] No warnings in R CMD check

9.6.5 Code Review Guidelines

For Reviewers:

Check for:
- Correctness of implementation
- Test coverage
- Performance implications
- Documentation completeness
- Code style consistency

Provide constructive feedback:

# Good feedback
"Consider using data.table's `.SDcols` here for better performance 
when selecting columns. Example: `DT[, .SD, .SDcols = patterns("^metric_")]`"

# Avoid
"This is inefficient"

For Authors:

Respond to all comments
Mark resolved conversations
Request re-review after changes

9.7 Common Development Tasks

9.7.1 Adding New Parameters

Update parameter list in simulation/engine.R:

default_params <- list(
  # ... existing parameters ...

  # Social network parameters
  network_density = 0.1,        # NEW: How connected the network is
  network_clustering = 0.3,     # NEW: Tendency to form cliques
  weak_tie_strength = 0.5       # NEW: Influence of weak ties
)

Add validation in parameter checking:

validate_parameters <- function(params) {
  assert_number(params$network_density, lower = 0, upper = 1)
  assert_number(params$network_clustering, lower = 0, upper = 1)
  assert_number(params$weak_tie_strength, lower = 0, upper = 1)

  # ... existing validations ...
}

Implement parameter effects:

# In simulate_interactions()
n_interactions <- ceiling(
  params$n_interactions_per_step * params$network_density
)

Document in user guide (docs/04-user-guide.Rmd):

### Network Parameters

- `network_density`: Controls how many interactions occur (0-1)
- `network_clustering`: Tendency to form cliques (0-1)
- `weak_tie_strength`: Influence of distant connections (0-1)

9.7.2 Creating New Metrics

Define calculation function:

#' Calculate network centralization
#' 
#' @param interactions data.table of interaction records
#' @param employees data.table of current employees
#' @return Numeric centralization score (0-1)
calculate_network_centralization <- function(interactions, employees) {
  # Count interactions per employee
  degree_dist <- interactions[
    agent1_id %in% employees$agent_id | 
    agent2_id %in% employees$agent_id
  ][, .(
    in_degree = sum(agent2_id == agent_id),
    out_degree = sum(agent1_id == agent_id)
  ), by = agent_id]

  # Calculate centralization (Freeman's formula)
  max_degree <- max(degree_dist$in_degree + degree_dist$out_degree)
  sum_diff <- sum(max_degree - (degree_dist$in_degree + degree_dist$out_degree))
  max_sum_diff <- (nrow(employees) - 1) * (nrow(employees) - 2)

  centralization <- sum_diff / max_sum_diff
  return(centralization)
}

Integrate into simulation:

# In update_org_metrics()
update_org_metrics <- function(org_state, step) {
  # ... existing metrics ...

  # Calculate network centralization
  if (nrow(org_state$interaction_history) > 0) {
    recent_interactions <- org_state$interaction_history[
      time >= step - org_state$params$interaction_window
    ]

    org_state$metrics$network_centralization <- c(
      org_state$metrics$network_centralization,
      calculate_network_centralization(
        recent_interactions, 
        org_state$employees
      )
    )
  }

  return(org_state)
}

Add visualization:

#' Plot network centralization over time
#' 
#' @param simulation_results Output from run_asa_simulation()
#' @return ggplot object
plot_network_centralization <- function(simulation_results) {
  metrics_df <- data.frame(
    step = seq_along(simulation_results$metrics$network_centralization),
    centralization = simulation_results$metrics$network_centralization
  )

  ggplot(metrics_df, aes(x = step, y = centralization)) +
    geom_line(color = "darkblue", size = 1) +
    geom_smooth(method = "loess", se = TRUE, alpha = 0.2) +
    labs(
      title = "Network Centralization Over Time",
      x = "Simulation Step",
      y = "Centralization (0-1)"
    ) +
    theme_minimal()
}

9.7.3 Implementing New Selection Strategies

Define strategy function:

#' Select applicants based on network potential
#' 
#' @param applicants data.table of applicants
#' @param org_state Current organization state
#' @return data.table with selection_score column
select_by_network_potential <- function(applicants, org_state) {
  # Calculate potential bridge connections
  current_categories <- org_state$employees[, .N, by = identity_category]

  applicants[, ':='(
    # Favor applicants who can bridge underconnected groups
    bridge_potential = sapply(identity_category, function(cat) {
      cat_size <- current_categories[identity_category == cat, N]
      if (is.na(cat_size)) cat_size <- 0

      # Higher score for smaller groups (more bridging potential)
      return(1 / (cat_size + 1))
    }),

    # Also consider social traits
    social_skills = extraversion * 0.5 + agreeableness * 0.3 + openness * 0.2
  )]

  # Combine into selection score
  applicants[, selection_score := bridge_potential * 0.6 + social_skills * 0.4]

  return(applicants)
}

Register strategy in hiring.R:

evaluate_applicants <- function(applicants, org_state, selection_criteria) {
  # ... existing criteria ...

  else if (selection_criteria == "network_potential") {
    applicants <- select_by_network_potential(applicants, org_state)
  }

  # ... rest of function ...
}

Add tests:

test_that("Network potential selection works correctly", {
  # Create org with unbalanced categories
  org_state <- list(
    employees = data.table(
      agent_id = paste0("emp_", 1:10),
      identity_category = c(rep("A", 7), rep("B", 2), "C")
    )
  )

  # Create applicants
  applicants <- data.table(
    agent_id = paste0("app_", 1:3),
    identity_category = c("A", "B", "C"),
    extraversion = c(0, 0, 0),
    agreeableness = c(0, 0, 0),
    openness = c(0, 0, 0)
  )

  # Apply selection
  result <- select_by_network_potential(applicants, org_state)

  # Category C applicant should score highest (smallest group)
  expect_equal(which.max(result$selection_score), 3)
})

9.7.4 Adding Visualization Functions

Create visualization function:

#' Create interactive satisfaction heatmap
#' 
#' @param org_state Organization state with employees
#' @param step Current simulation step
#' @return plotly object
create_satisfaction_heatmap <- function(org_state, step = NULL) {
  # Prepare data
  satisfaction_matrix <- dcast(
    org_state$interaction_history[
      time == max(time)
    ][
      , .(agent1_id, agent2_id, satisfaction)
    ],
    agent1_id ~ agent2_id,
    value.var = "satisfaction",
    fill = NA
  )

  # Convert to matrix
  sat_mat <- as.matrix(satisfaction_matrix[, -1])
  rownames(sat_mat) <- satisfaction_matrix$agent1_id

  # Create heatmap
  plot_ly(
    z = sat_mat,
    x = colnames(sat_mat),
    y = rownames(sat_mat),
    type = "heatmap",
    colorscale = "RdBu",
    zmin = -1,
    zmax = 1,
    text = matrix(
      paste("From:", rep(rownames(sat_mat), ncol(sat_mat)),
            "<br>To:", rep(colnames(sat_mat), each = nrow(sat_mat)),
            "<br>Satisfaction:", round(as.vector(sat_mat), 2)),
      nrow = nrow(sat_mat)
    ),
    hoverinfo = "text"
  ) %>%
  layout(
    title = paste("Interaction Satisfaction Matrix",
                 ifelse(!is.null(step), paste("- Step", step), "")),
    xaxis = list(title = "Agent 2"),
    yaxis = list(title = "Agent 1")
  )
}

Add to visualization suite:

# In create_simulation_dashboard()
create_simulation_dashboard <- function(simulation_results) {
  # ... existing plots ...

  # Add satisfaction heatmap
  satisfaction_heat <- create_satisfaction_heatmap(
    simulation_results$final_state,
    step = simulation_results$params$n_steps
  )

  # Combine into dashboard
  subplot(
    diversity_plot,
    satisfaction_plot,
    satisfaction_heat,
    nrows = 2,
    shareX = FALSE,
    titleY = TRUE
  )
}

9.8 Debugging Tips

9.8.1 Common Errors and Solutions

“object ‘column_name’ not found”

# Problem: Column doesn't exist in data.table
employees[, new_col := old_col * 2]  # Error if old_col missing

# Solution: Check column exists first
if ("old_col" %in% names(employees)) {
  employees[, new_col := old_col * 2]
} else {
  warning("Column 'old_col' not found, skipping calculation")
}

“invalid subscript type ‘list’”

# Problem: Using list column incorrectly
employees[, mean(satisfaction_history)]  # Error

# Solution: Unlist or use sapply
employees[, mean(unlist(satisfaction_history)), by = agent_id]

Memory issues with large simulations

# Problem: Storing full history exhausts memory

# Solution: Implement rolling window storage
store_history <- function(history, new_data, max_size = 1000) {
  history <- rbind(history, new_data)
  if (nrow(history) > max_size) {
    history <- history[-(1:(nrow(history) - max_size))]
  }
  return(history)
}

9.8.2 Debugging Techniques for R

Use browser() for interactive debugging:

calculate_complex_metric <- function(data) {
  # Some preparation
  processed <- prepare_data(data)

  browser()  # Execution stops here

  # Complex calculation
  result <- complex_calculation(processed)
  return(result)
}

Add verbose logging:

run_simulation_step <- function(org_state, step, params) {
  if (params$verbose) {
    cat(sprintf("\n=== Step %d ===\n", step))
    cat(sprintf("Employees: %d\n", nrow(org_state$employees)))
    cat(sprintf("Mean satisfaction: %.3f\n", 
                mean(org_state$employees$satisfaction)))
  }

  # ... rest of function ...
}

Use trace() for non-invasive debugging:

# Add debugging to existing function without modifying source
trace(calculate_diversity_index, 
      tracer = quote(print(paste("Input has", nrow(employees), "rows"))))

# Remove tracing
untrace(calculate_diversity_index)

Defensive programming with assertions:

process_interactions <- function(employees, interactions) {
  # Validate inputs
  assert_data_table(employees)
  assert_data_table(interactions)
  assert_subset(c("agent1_id", "agent2_id"), names(interactions))

  # Check data integrity
  invalid_agents <- setdiff(
    c(interactions$agent1_id, interactions$agent2_id),
    employees$agent_id
  )

  if (length(invalid_agents) > 0) {
    stop(sprintf("Invalid agent IDs in interactions: %s",
                paste(invalid_agents, collapse = ", ")))
  }

  # ... rest of function ...
}

9.8.3 Performance Profiling

Profile entire simulation:

library(profvis)

profvis({
  results <- run_asa_simulation(
    n_steps = 100,
    initial_size = 500,
    params = list(
      n_interactions_per_step = 20,
      verbose = FALSE
    )
  )
})

Benchmark specific functions:

library(microbenchmark)

# Compare different diversity calculations
employees <- create_test_employees(1000)

microbenchmark(
  blau = calculate_diversity_index(employees, "blau"),
  shannon = calculate_diversity_index(employees, "shannon"),
  custom = calculate_diversity_index(employees, "custom"),
  times = 100
)

Memory profiling:

library(pryr)

# Check object sizes
object_size(org_state$employees)
object_size(org_state$interaction_history)

# Track memory usage
mem_before <- mem_used()
results <- run_asa_simulation(n_steps = 50)
mem_after <- mem_used()

cat(sprintf("Memory used: %.2f MB\n", 
           (mem_after - mem_before) / 1024^2))

9.8.4 Using Verbose Mode Effectively

Implement tiered verbosity:

# In parameters
params$verbose_level <- 2  # 0=silent, 1=basic, 2=detailed, 3=debug

# In functions
log_message <- function(message, level = 1, current_level = params$verbose_level) {
  if (current_level >= level) {
    cat(paste0(rep("  ", level - 1), message, "\n"))
  }
}

# Usage
log_message("Starting simulation", level = 1)
log_message("Initializing organization", level = 2)
log_message(sprintf("Created %d employees", nrow(employees)), level = 3)

Add progress bars for long operations:

run_asa_simulation <- function(n_steps, ..., show_progress = TRUE) {
  if (show_progress) {
    pb <- txtProgressBar(min = 0, max = n_steps, style = 3)
  }

  for (step in 1:n_steps) {
    # ... simulation step ...

    if (show_progress) {
      setTxtProgressBar(pb, step)
    }
  }

  if (show_progress) {
    close(pb)
  }
}

9.9 Summary

This contributor’s guide provides a comprehensive walkthrough for developing and extending the ASA ABM v2 codebase. Key takeaways:

Start small: Begin with simple features and gradually tackle more complex ones
Test everything: Write tests before, during, and after implementation
Document as you go: Future you (and other contributors) will thank you
Profile early: Don’t wait until performance becomes a problem
Ask questions: The community is here to help

Remember that good code is not just code that works—it’s code that others can understand, maintain, and build upon. Happy contributing!