Chapter 9 Contributor’s Walkthrough

Welcome to the ASA ABM v2 contributor’s guide! This chapter provides a practical walkthrough for developers who want to contribute to or extend the codebase. Whether you’re adding new features, fixing bugs, or improving performance, this guide will help you navigate the codebase effectively.

9.1 Getting Started as a Contributor

9.1.1 Setting up the Development Environment

  1. Clone the repository:
git clone [repository-url]
cd asa_abm_v2
  1. Install R dependencies:
# Core dependencies
install.packages(c(
  "data.table",      # High-performance data manipulation
  "checkmate",       # Input validation
  "ggplot2",         # Visualization
  "plotly",          # Interactive plots
  "tidyverse",       # Data manipulation utilities
  "testthat",        # Testing framework
  "microbenchmark",  # Performance testing
  "profvis"          # Profiling tool
))

# Documentation dependencies
install.packages(c(
  "bookdown",
  "knitr",
  "rmarkdown"
))
  1. Verify installation:
source("run_simulation.R")

# Run a quick test simulation
test_sim <- run_asa_simulation(
  n_steps = 10,
  initial_size = 20,
  verbose = TRUE
)

9.1.2 Understanding the Codebase Structure

The codebase follows a modular architecture:

asa_abm_v2/
├── core/                  # Core components
│   ├── agent.R           # Agent definitions
│   ├── organization.R    # Organization structure
│   └── interactions.R    # Social interactions
├── simulation/           # Simulation logic
│   ├── engine.R         # Main simulation loop
│   ├── hiring.R         # Hiring processes
│   └── turnover.R       # Turnover mechanisms
├── analysis/            # Analysis tools
├── tests/               # Test suite
├── config/              # Configuration files
└── docs/                # Documentation

9.1.3 Running Tests and Examples

# Run all tests
testthat::test_dir("tests/")

# Run specific test file
testthat::test_file("tests/test_agent.R")

# Run example simulations
source("docs/vignettes/basic_simulation.R")

9.2 Step-by-Step Feature Tutorial: Adding a New Personality Trait

Let’s walk through adding a new personality trait called “risk_tolerance” to the simulation. This tutorial shows exactly which files need modification and how changes flow through the system.

9.2.1 Step 1: Update Agent Definition

First, modify core/agent.R to include the new trait:

# In create_applicant_pool() function around line 30
create_applicant_pool <- function(n_applicants = 50,
                                identity_categories = c("A", "B", "C", "D", "E")) {
  
  # ... existing code ...
  
  # Create applicant pool
  applicants <- data.table(
    agent_id = applicant_ids,
    identity_category = sample(identity_categories, n_applicants, replace = TRUE),
    
    # Big Five personality traits
    openness = rnorm(n_applicants, mean = 0, sd = 1),
    conscientiousness = rnorm(n_applicants, mean = 0, sd = 1),
    extraversion = rnorm(n_applicants, mean = 0, sd = 1),
    agreeableness = rnorm(n_applicants, mean = 0, sd = 1),
    emotional_stability = rnorm(n_applicants, mean = 0, sd = 1),
    
    # NEW TRAIT: Risk tolerance
    risk_tolerance = rnorm(n_applicants, mean = 0, sd = 1),
    
    # Preferences
    diversity_preference = rnorm(n_applicants, mean = 0, sd = 1),
    homophily_preference = rnorm(n_applicants, mean = 0, sd = 1),
    
    # Application state
    attraction = 0,
    application_time = 0
  )
  
  # ... rest of function ...
}

# Also update convert_to_employee() function
convert_to_employee <- function(applicant, hire_time) {
  employee <- copy(applicant)
  employee[, `:=`(
    hire_time = hire_time,
    tenure = 0,
    satisfaction = 0,
    performance = calculate_initial_performance(applicant),
    risk_taking_behavior = 0  # NEW: Initialize behavior based on trait
  )]
  
  return(employee)
}

# NEW FUNCTION: Calculate risk-based performance modifier
calculate_risk_performance_modifier <- function(risk_tolerance) {
  # High risk tolerance can lead to bigger wins or losses
  risk_factor <- pnorm(risk_tolerance) # Convert to 0-1 scale
  modifier <- rnorm(1, mean = 0, sd = risk_factor * 0.2)
  return(modifier)
}

9.2.2 Step 2: Update Organization Initialization

Modify core/organization.R to handle the new trait:

# In initialize_organization() function
initialize_organization <- function(size = 100, 
                                  identity_categories = c("A", "B", "C", "D", "E"),
                                  params = list()) {
  
  # ... existing code ...
  
  # Create initial employees with new trait
  employees <- data.table(
    agent_id = agent_ids,
    identity_category = sample(identity_categories, size, replace = TRUE),
    
    # Personality traits
    openness = rnorm(size, mean = 0, sd = 1),
    conscientiousness = rnorm(size, mean = 0, sd = 1),
    extraversion = rnorm(size, mean = 0, sd = 1),
    agreeableness = rnorm(size, mean = 0, sd = 1),
    emotional_stability = rnorm(size, mean = 0, sd = 1),
    risk_tolerance = rnorm(size, mean = 0, sd = 1),  # NEW
    
    # ... rest of initialization ...
  )
  
  # ... rest of function ...
}

9.2.3 Step 3: Incorporate into Interactions

Update core/interactions.R to use the new trait:

# Add risk tolerance influence to satisfaction calculation
calculate_interaction_satisfaction <- function(agent1, agent2, org_culture = NULL) {
  
  # ... existing similarity calculations ...
  
  # NEW: Risk tolerance compatibility
  risk_compatibility <- 1 - abs(agent1$risk_tolerance - agent2$risk_tolerance) / 4
  
  # Update satisfaction calculation
  satisfaction <- mean(c(
    identity_match,
    personality_similarity,
    risk_compatibility  # NEW component
  ))
  
  # Apply cultural moderation if available
  if (!is.null(org_culture)) {
    culture_fit <- calculate_culture_fit(agent1, org_culture)
    
    # NEW: Risk-taking culture modifier
    if (org_culture$risk_taking_culture > 0) {
      risk_bonus <- agent1$risk_tolerance * org_culture$risk_taking_culture * 0.1
      satisfaction <- satisfaction + risk_bonus
    }
  }
  
  return(satisfaction)
}

9.2.4 Step 4: Update Selection Criteria

Modify simulation/hiring.R to allow selection based on risk tolerance:

# In evaluate_applicants() function
evaluate_applicants <- function(applicants, org_state, selection_criteria = "conscientiousness") {
  
  # ... existing code ...
  
  # Evaluate based on criteria
  if (selection_criteria == "conscientiousness") {
    applicants[, selection_score := conscientiousness]
  } else if (selection_criteria == "cultural_fit") {
    applicants[, selection_score := calculate_cultural_fit(.SD, org_culture), by = agent_id]
  } else if (selection_criteria == "risk_tolerance") {  # NEW
    # Organizations might want risk-takers or risk-averse employees
    target_risk <- org_state$params$target_risk_level %||% 0
    applicants[, selection_score := -abs(risk_tolerance - target_risk)]
  } else if (selection_criteria == "balanced") {
    # NEW: Include risk tolerance in balanced selection
    applicants[, selection_score := (
      conscientiousness * 0.3 +
      emotional_stability * 0.2 +
      agreeableness * 0.2 +
      openness * 0.15 +
      extraversion * 0.1 +
      risk_tolerance * org_state$params$risk_weight %||% 0.05
    )]
  }
  
  # ... rest of function ...
}

9.2.5 Step 5: Add Performance Impact

Update performance calculations in simulation/engine.R:

# Add new function for risk-based events
simulate_risk_events <- function(employees, org_state) {
  # High risk tolerance employees may generate volatile outcomes
  risk_takers <- employees[risk_tolerance > 1]
  
  if (nrow(risk_takers) > 0) {
    risk_takers[, risk_event := runif(.N) < pnorm(risk_tolerance) * 0.1]
    
    # Calculate performance impact
    risk_takers[risk_event == TRUE, performance_modifier := 
                calculate_risk_performance_modifier(risk_tolerance)]
    
    # Update organization metrics
    total_impact <- sum(risk_takers$performance_modifier, na.rm = TRUE)
    org_state$metrics$risk_impact <- c(org_state$metrics$risk_impact, total_impact)
  }
  
  return(employees)
}

# Integrate into main simulation loop
run_simulation_step <- function(org_state, step, params) {
  
  # ... existing step logic ...
  
  # NEW: Simulate risk events
  org_state$employees <- simulate_risk_events(org_state$employees, org_state)
  
  # ... rest of step ...
}

9.2.6 Step 6: Add Tests

Create tests/test_risk_tolerance.R:

library(testthat)
library(data.table)

# Source required files
source("../core/agent.R")
source("../core/organization.R")
source("../core/interactions.R")

test_that("Risk tolerance is properly initialized", {
  # Test applicant creation
  applicants <- create_applicant_pool(n_applicants = 100)
  
  expect_true("risk_tolerance" %in% names(applicants))
  expect_equal(length(applicants$risk_tolerance), 100)
  expect_true(all(!is.na(applicants$risk_tolerance)))
  
  # Test distribution
  expect_true(abs(mean(applicants$risk_tolerance)) < 0.2)
  expect_true(abs(sd(applicants$risk_tolerance) - 1) < 0.2)
})

test_that("Risk tolerance affects interactions", {
  # Create two agents with different risk tolerances
  agent1 <- data.table(
    agent_id = "test1",
    risk_tolerance = 2,
    openness = 0, conscientiousness = 0,
    extraversion = 0, agreeableness = 0,
    emotional_stability = 0,
    identity_category = "A"
  )
  
  agent2 <- data.table(
    agent_id = "test2", 
    risk_tolerance = -2,
    openness = 0, conscientiousness = 0,
    extraversion = 0, agreeableness = 0,
    emotional_stability = 0,
    identity_category = "A"
  )
  
  # Test interaction satisfaction
  satisfaction <- calculate_interaction_satisfaction(agent1, agent2)
  
  # Should be lower due to risk tolerance difference
  expect_true(satisfaction < 0.7)
})

test_that("Risk-based selection works correctly", {
  applicants <- create_applicant_pool(n_applicants = 50)
  
  org_state <- list(
    params = list(target_risk_level = 1.5),
    culture = list(risk_taking_culture = 0.5)
  )
  
  evaluated <- evaluate_applicants(
    applicants, 
    org_state, 
    selection_criteria = "risk_tolerance"
  )
  
  # Check that selection scores favor applicants close to target
  best_applicant <- evaluated[which.max(selection_score)]
  expect_true(abs(best_applicant$risk_tolerance - 1.5) < 1)
})

test_that("Risk performance modifier calculates correctly", {
  # Test multiple risk levels
  low_risk <- calculate_risk_performance_modifier(-2)
  high_risk <- calculate_risk_performance_modifier(2)
  
  # High risk should have higher variance
  set.seed(123)
  low_risk_mods <- replicate(1000, calculate_risk_performance_modifier(-2))
  high_risk_mods <- replicate(1000, calculate_risk_performance_modifier(2))
  
  expect_true(var(high_risk_mods) > var(low_risk_mods))
})

9.2.7 Step 7: Update Documentation

Add to docs/06-api-reference.Rmd:

### Agent Attributes

Agents in the simulation have the following attributes:

- **risk_tolerance**: Normal(0, 1) - Propensity for risk-taking behavior
  - Values > 1: Risk-seeking individuals
  - Values < -1: Risk-averse individuals
  - Affects performance variability and interaction satisfaction

9.3 Code Flow Walkthrough

9.3.1 Simulation Step Trace

Here’s a detailed trace through one simulation step:

# STEP 1: Main loop calls run_simulation_step()
run_asa_simulation()
  └── for (step in 1:n_steps)
        └── run_simulation_step(org_state, step, params)

# STEP 2: Check for hiring needs
run_simulation_step()
  ├── if (step %% params$hiring_frequency == 0)
  │     └── perform_hiring_round()
  │           ├── create_applicant_pool()
  │           ├── calculate_org_attractiveness()
  │           ├── evaluate_applicants()
  │           └── hire_applicants()

  # STEP 3: Process interactions
  ├── simulate_interactions()
  │     ├── sample_interaction_pairs()
  │     ├── for each pair:
  │     │     └── calculate_interaction_satisfaction()
  │     └── update_satisfaction_history()

  # STEP 4: Calculate turnover
  ├── process_turnover()
  │     ├── calculate_turnover_probability()
  │     ├── identify_leavers()
  │     └── remove_employees()

  # STEP 5: Update metrics
  └── update_org_metrics()
        ├── calculate_diversity_index()
        ├── calculate_mean_satisfaction()
        ├── calculate_performance_metrics()
        └── store_step_history()

9.3.2 Data Flow Diagram

┌─────────────────┐     ┌──────────────────┐     ┌─────────────────┐
│ Applicant Pool  │────▶│ Organization     │────▶│ History/Metrics │
└─────────────────┘     └──────────────────┘     └─────────────────┘
        │                        │                         │
        │                        ▼                         │
        │               ┌──────────────────┐              │
        │               │    Employees     │              │
        │               └──────────────────┘              │
        │                        │                         │
        │                        ▼                         │
        │               ┌──────────────────┐              │
        └──────────────▶│  Interactions    │──────────────┘
                        └──────────────────┘

9.3.3 Key Functions and Responsibilities

Module Function Responsibility
engine.R run_asa_simulation() Main entry point, orchestrates simulation
engine.R run_simulation_step() Executes one time step
agent.R create_applicant_pool() Generates new applicants
organization.R initialize_organization() Sets up initial state
hiring.R perform_hiring_round() Manages hiring process
interactions.R simulate_interactions() Handles social dynamics
turnover.R process_turnover() Manages employee departures

9.4 Testing Philosophy

9.4.1 Types of Tests

  1. Unit Tests: Test individual functions in isolation

    test_that("Diversity index calculates correctly", {
      employees <- data.table(
        identity_category = c("A", "A", "B", "B", "C")
      )
    
      blau_index <- calculate_diversity_index(employees, method = "blau")
      expect_equal(blau_index, 0.64)  # 1 - (0.4^2 + 0.4^2 + 0.2^2)
    })
  2. Integration Tests: Test component interactions

    test_that("Hiring process integrates correctly", {
      org_state <- initialize_organization(size = 10)
      initial_size <- nrow(org_state$employees)
    
      org_state <- perform_hiring_round(org_state, n_to_hire = 5)
    
      expect_equal(nrow(org_state$employees), initial_size + 5)
      expect_true(all(org_state$employees$hire_time >= 0))
    })
  3. Regression Tests: Ensure changes don’t break existing functionality

    test_that("Simulation produces consistent results", {
      set.seed(12345)
      sim1 <- run_asa_simulation(n_steps = 10, initial_size = 20)
    
      set.seed(12345)
      sim2 <- run_asa_simulation(n_steps = 10, initial_size = 20)
    
      expect_identical(sim1$metrics, sim2$metrics)
    })

9.4.2 Writing Good Tests

Follow the AAA pattern:

test_that("Employee satisfaction updates correctly", {
  # ARRANGE - Set up test data
  employee <- data.table(
    agent_id = "emp_001",
    satisfaction = 0,
    satisfaction_history = list(numeric(0))
  )
  
  # ACT - Execute the function
  updated_employee <- update_satisfaction(employee, new_satisfaction = 0.7)
  
  # ASSERT - Check results
  expect_equal(updated_employee$satisfaction, 0.7)
  expect_length(updated_employee$satisfaction_history[[1]], 1)
  expect_equal(updated_employee$satisfaction_history[[1]][1], 0.7)
})

9.4.3 Test Coverage Expectations

  • Core functions: 100% coverage required
  • Utility functions: 90% coverage minimum
  • Edge cases: Must test boundary conditions
  • Error handling: Test invalid inputs

Check coverage with:

library(covr)
cov <- package_coverage()
report(cov)

9.4.4 Running the Test Suite

# Run all tests with detailed output
testthat::test_dir("tests/", reporter = "progress")

# Run tests for a specific module
testthat::test_file("tests/test_hiring.R")

# Run tests matching a pattern
testthat::test_dir("tests/", filter = "diversity")

# Run with coverage report
covr::report(covr::package_coverage())

9.5 Development Best Practices

9.5.1 R Coding Standards

  1. Function naming: Use descriptive verb_noun format

    # Good
    calculate_diversity_index()
    update_employee_satisfaction()
    
    # Avoid
    divIndex()
    empSatUpd()
  2. Variable naming: Use snake_case for variables

    # Good
    employee_count <- nrow(employees)
    mean_satisfaction <- mean(employees$satisfaction)
    
    # Avoid
    employeeCount <- nrow(employees)
    meanSat <- mean(employees$satisfaction)
  3. Function documentation: Use roxygen2 format

    #' Calculate organization's cultural distance from target
    #' 
    #' @param org_culture Current organizational culture (list)
    #' @param target_culture Target cultural values (list)
    #' @param weights Optional weights for culture dimensions
    #' @return Numeric distance score (0 = identical, higher = more different)
    #' @examples
    #' culture <- list(innovation = 0.7, stability = 0.3)
    #' target <- list(innovation = 0.9, stability = 0.1)
    #' calculate_cultural_distance(culture, target)
    calculate_cultural_distance <- function(org_culture, target_culture, weights = NULL) {
      # Implementation
    }

9.5.2 Using data.table Effectively

  1. Reference semantics: Use := for in-place updates

    # Efficient - modifies in place
    employees[, satisfaction := satisfaction + 0.1]
    
    # Inefficient - creates copy
    employees$satisfaction <- employees$satisfaction + 0.1
  2. Group operations: Use by for grouped calculations

    # Calculate mean satisfaction by identity category
    employees[, .(
      mean_satisfaction = mean(satisfaction),
      count = .N
    ), by = identity_category]
  3. Keys for performance: Set keys on frequently joined columns

    setkey(employees, agent_id)
    setkey(interactions, agent1_id, agent2_id)
  4. Chaining operations: Use data.table chaining

    employees[
      tenure > 12
    ][
      , satisfaction_rank := rank(-satisfaction)
    ][
      satisfaction_rank <= 10
    ]

9.5.3 Performance Considerations

  1. Preallocate memory:

    # Good - preallocate
    results <- vector("numeric", n_steps)
    for (i in 1:n_steps) {
      results[i] <- calculate_metric(i)
    }
    
    # Avoid - growing vectors
    results <- c()
    for (i in 1:n_steps) {
      results <- c(results, calculate_metric(i))
    }
  2. Vectorize operations:

    # Good - vectorized
    employees[, performance := conscientiousness * 0.3 + 
                               emotional_stability * 0.2 + 
                               satisfaction * 0.5]
    
    # Avoid - row-by-row
    for (i in 1:nrow(employees)) {
      employees$performance[i] <- employees$conscientiousness[i] * 0.3 +
                                  employees$emotional_stability[i] * 0.2 +
                                  employees$satisfaction[i] * 0.5
    }
  3. Profile bottlenecks:

    library(profvis)
    
    profvis({
      sim_results <- run_asa_simulation(
        n_steps = 100,
        initial_size = 200
      )
    })

9.5.4 Documentation Standards

  1. Function-level documentation: Every exported function needs roxygen2 docs

  2. Inline comments: Explain “why”, not “what”

    # Good: Explains reasoning
    # Use Blau index for categorical diversity as it handles 
    # multiple categories better than simple proportion variance
    diversity <- calculate_blau_index(categories)
    
    # Avoid: States the obvious
    # Calculate diversity
    diversity <- calculate_blau_index(categories)
  3. Complex algorithms: Add block comments

    # The Attraction-Selection-Attrition process:
    # 1. ATTRACTION: Applicants evaluate org attractiveness based on
    #    visible diversity and cultural signals
    # 2. SELECTION: Organization evaluates applicants using
    #    configured criteria (conscientiousness, fit, etc.)
    # 3. ATTRITION: Employees leave based on satisfaction levels
    #    accumulated through interactions

9.6 Git Workflow

9.6.1 Branching Strategy

We use Git Flow:

main
  └── develop
        ├── feature/add-risk-tolerance
        ├── feature/improve-performance
        ├── bugfix/hiring-calculation
        └── hotfix/critical-error

9.6.2 Branch Naming Conventions

  • Features: feature/descriptive-name
  • Bugfixes: bugfix/issue-description
  • Hotfixes: hotfix/critical-issue
  • Releases: release/version-number

9.6.3 Commit Message Conventions

Follow the conventional commits format:

type(scope): subject

body

footer

Examples:

feat(hiring): add risk-based selection criteria

- Implement risk_tolerance attribute for agents
- Add selection strategy based on risk preference
- Update tests for new functionality

Closes #123

fix(turnover): correct satisfaction threshold calculation

The previous implementation used >= instead of >, causing
employees with exactly threshold satisfaction to leave.

perf(interactions): optimize interaction sampling

- Replace nested loops with vectorized operations
- Add index on interaction history
- Reduces step time by ~40% for large organizations

docs(api): update hiring function documentation

- Add examples for new selection criteria
- Clarify parameter descriptions
- Fix typos in return value docs

9.6.4 Pull Request Process

  1. Create feature branch:

    git checkout develop
    git pull origin develop
    git checkout -b feature/your-feature-name
  2. Make changes and commit:

    git add -A
    git commit -m "feat(module): description"
  3. Keep branch updated:

    git checkout develop
    git pull origin develop
    git checkout feature/your-feature-name
    git rebase develop
  4. Push and create PR:

    git push origin feature/your-feature-name
  5. PR description template:

    ## Description
    Brief description of changes
    
    ## Type of Change
    - [ ] Bug fix
    - [ ] New feature
    - [ ] Breaking change
    - [ ] Documentation update
    
    ## Testing
    - [ ] Unit tests pass
    - [ ] Integration tests pass
    - [ ] Manual testing completed
    
    ## Checklist
    - [ ] Code follows style guidelines
    - [ ] Self-review completed
    - [ ] Comments added for complex sections
    - [ ] Documentation updated
    - [ ] No warnings in R CMD check

9.6.5 Code Review Guidelines

For Reviewers:

  1. Check for:

    • Correctness of implementation
    • Test coverage
    • Performance implications
    • Documentation completeness
    • Code style consistency
  2. Provide constructive feedback:

    # Good feedback
    "Consider using data.table's `.SDcols` here for better performance 
    when selecting columns. Example: `DT[, .SD, .SDcols = patterns("^metric_")]`"
    
    # Avoid
    "This is inefficient"

For Authors:

  1. Respond to all comments
  2. Mark resolved conversations
  3. Request re-review after changes

9.7 Common Development Tasks

9.7.1 Adding New Parameters

  1. Update parameter list in simulation/engine.R:

    default_params <- list(
      # ... existing parameters ...
    
      # Social network parameters
      network_density = 0.1,        # NEW: How connected the network is
      network_clustering = 0.3,     # NEW: Tendency to form cliques
      weak_tie_strength = 0.5       # NEW: Influence of weak ties
    )
  2. Add validation in parameter checking:

    validate_parameters <- function(params) {
      assert_number(params$network_density, lower = 0, upper = 1)
      assert_number(params$network_clustering, lower = 0, upper = 1)
      assert_number(params$weak_tie_strength, lower = 0, upper = 1)
    
      # ... existing validations ...
    }
  3. Implement parameter effects:

    # In simulate_interactions()
    n_interactions <- ceiling(
      params$n_interactions_per_step * params$network_density
    )
  4. Document in user guide (docs/04-user-guide.Rmd):

    ### Network Parameters
    
    - `network_density`: Controls how many interactions occur (0-1)
    - `network_clustering`: Tendency to form cliques (0-1)
    - `weak_tie_strength`: Influence of distant connections (0-1)

9.7.2 Creating New Metrics

  1. Define calculation function:

    #' Calculate network centralization
    #' 
    #' @param interactions data.table of interaction records
    #' @param employees data.table of current employees
    #' @return Numeric centralization score (0-1)
    calculate_network_centralization <- function(interactions, employees) {
      # Count interactions per employee
      degree_dist <- interactions[
        agent1_id %in% employees$agent_id | 
        agent2_id %in% employees$agent_id
      ][, .(
        in_degree = sum(agent2_id == agent_id),
        out_degree = sum(agent1_id == agent_id)
      ), by = agent_id]
    
      # Calculate centralization (Freeman's formula)
      max_degree <- max(degree_dist$in_degree + degree_dist$out_degree)
      sum_diff <- sum(max_degree - (degree_dist$in_degree + degree_dist$out_degree))
      max_sum_diff <- (nrow(employees) - 1) * (nrow(employees) - 2)
    
      centralization <- sum_diff / max_sum_diff
      return(centralization)
    }
  2. Integrate into simulation:

    # In update_org_metrics()
    update_org_metrics <- function(org_state, step) {
      # ... existing metrics ...
    
      # Calculate network centralization
      if (nrow(org_state$interaction_history) > 0) {
        recent_interactions <- org_state$interaction_history[
          time >= step - org_state$params$interaction_window
        ]
    
        org_state$metrics$network_centralization <- c(
          org_state$metrics$network_centralization,
          calculate_network_centralization(
            recent_interactions, 
            org_state$employees
          )
        )
      }
    
      return(org_state)
    }
  3. Add visualization:

    #' Plot network centralization over time
    #' 
    #' @param simulation_results Output from run_asa_simulation()
    #' @return ggplot object
    plot_network_centralization <- function(simulation_results) {
      metrics_df <- data.frame(
        step = seq_along(simulation_results$metrics$network_centralization),
        centralization = simulation_results$metrics$network_centralization
      )
    
      ggplot(metrics_df, aes(x = step, y = centralization)) +
        geom_line(color = "darkblue", size = 1) +
        geom_smooth(method = "loess", se = TRUE, alpha = 0.2) +
        labs(
          title = "Network Centralization Over Time",
          x = "Simulation Step",
          y = "Centralization (0-1)"
        ) +
        theme_minimal()
    }

9.7.3 Implementing New Selection Strategies

  1. Define strategy function:

    #' Select applicants based on network potential
    #' 
    #' @param applicants data.table of applicants
    #' @param org_state Current organization state
    #' @return data.table with selection_score column
    select_by_network_potential <- function(applicants, org_state) {
      # Calculate potential bridge connections
      current_categories <- org_state$employees[, .N, by = identity_category]
    
      applicants[, ':='(
        # Favor applicants who can bridge underconnected groups
        bridge_potential = sapply(identity_category, function(cat) {
          cat_size <- current_categories[identity_category == cat, N]
          if (is.na(cat_size)) cat_size <- 0
    
          # Higher score for smaller groups (more bridging potential)
          return(1 / (cat_size + 1))
        }),
    
        # Also consider social traits
        social_skills = extraversion * 0.5 + agreeableness * 0.3 + openness * 0.2
      )]
    
      # Combine into selection score
      applicants[, selection_score := bridge_potential * 0.6 + social_skills * 0.4]
    
      return(applicants)
    }
  2. Register strategy in hiring.R:

    evaluate_applicants <- function(applicants, org_state, selection_criteria) {
      # ... existing criteria ...
    
      else if (selection_criteria == "network_potential") {
        applicants <- select_by_network_potential(applicants, org_state)
      }
    
      # ... rest of function ...
    }
  3. Add tests:

    test_that("Network potential selection works correctly", {
      # Create org with unbalanced categories
      org_state <- list(
        employees = data.table(
          agent_id = paste0("emp_", 1:10),
          identity_category = c(rep("A", 7), rep("B", 2), "C")
        )
      )
    
      # Create applicants
      applicants <- data.table(
        agent_id = paste0("app_", 1:3),
        identity_category = c("A", "B", "C"),
        extraversion = c(0, 0, 0),
        agreeableness = c(0, 0, 0),
        openness = c(0, 0, 0)
      )
    
      # Apply selection
      result <- select_by_network_potential(applicants, org_state)
    
      # Category C applicant should score highest (smallest group)
      expect_equal(which.max(result$selection_score), 3)
    })

9.7.4 Adding Visualization Functions

  1. Create visualization function:

    #' Create interactive satisfaction heatmap
    #' 
    #' @param org_state Organization state with employees
    #' @param step Current simulation step
    #' @return plotly object
    create_satisfaction_heatmap <- function(org_state, step = NULL) {
      # Prepare data
      satisfaction_matrix <- dcast(
        org_state$interaction_history[
          time == max(time)
        ][
          , .(agent1_id, agent2_id, satisfaction)
        ],
        agent1_id ~ agent2_id,
        value.var = "satisfaction",
        fill = NA
      )
    
      # Convert to matrix
      sat_mat <- as.matrix(satisfaction_matrix[, -1])
      rownames(sat_mat) <- satisfaction_matrix$agent1_id
    
      # Create heatmap
      plot_ly(
        z = sat_mat,
        x = colnames(sat_mat),
        y = rownames(sat_mat),
        type = "heatmap",
        colorscale = "RdBu",
        zmin = -1,
        zmax = 1,
        text = matrix(
          paste("From:", rep(rownames(sat_mat), ncol(sat_mat)),
                "<br>To:", rep(colnames(sat_mat), each = nrow(sat_mat)),
                "<br>Satisfaction:", round(as.vector(sat_mat), 2)),
          nrow = nrow(sat_mat)
        ),
        hoverinfo = "text"
      ) %>%
      layout(
        title = paste("Interaction Satisfaction Matrix",
                     ifelse(!is.null(step), paste("- Step", step), "")),
        xaxis = list(title = "Agent 2"),
        yaxis = list(title = "Agent 1")
      )
    }
  2. Add to visualization suite:

    # In create_simulation_dashboard()
    create_simulation_dashboard <- function(simulation_results) {
      # ... existing plots ...
    
      # Add satisfaction heatmap
      satisfaction_heat <- create_satisfaction_heatmap(
        simulation_results$final_state,
        step = simulation_results$params$n_steps
      )
    
      # Combine into dashboard
      subplot(
        diversity_plot,
        satisfaction_plot,
        satisfaction_heat,
        nrows = 2,
        shareX = FALSE,
        titleY = TRUE
      )
    }

9.8 Debugging Tips

9.8.1 Common Errors and Solutions

  1. “object ‘column_name’ not found”

    # Problem: Column doesn't exist in data.table
    employees[, new_col := old_col * 2]  # Error if old_col missing
    
    # Solution: Check column exists first
    if ("old_col" %in% names(employees)) {
      employees[, new_col := old_col * 2]
    } else {
      warning("Column 'old_col' not found, skipping calculation")
    }
  2. “invalid subscript type ‘list’”

    # Problem: Using list column incorrectly
    employees[, mean(satisfaction_history)]  # Error
    
    # Solution: Unlist or use sapply
    employees[, mean(unlist(satisfaction_history)), by = agent_id]
  3. Memory issues with large simulations

    # Problem: Storing full history exhausts memory
    
    # Solution: Implement rolling window storage
    store_history <- function(history, new_data, max_size = 1000) {
      history <- rbind(history, new_data)
      if (nrow(history) > max_size) {
        history <- history[-(1:(nrow(history) - max_size))]
      }
      return(history)
    }

9.8.2 Debugging Techniques for R

  1. Use browser() for interactive debugging:

    calculate_complex_metric <- function(data) {
      # Some preparation
      processed <- prepare_data(data)
    
      browser()  # Execution stops here
    
      # Complex calculation
      result <- complex_calculation(processed)
      return(result)
    }
  2. Add verbose logging:

    run_simulation_step <- function(org_state, step, params) {
      if (params$verbose) {
        cat(sprintf("\n=== Step %d ===\n", step))
        cat(sprintf("Employees: %d\n", nrow(org_state$employees)))
        cat(sprintf("Mean satisfaction: %.3f\n", 
                    mean(org_state$employees$satisfaction)))
      }
    
      # ... rest of function ...
    }
  3. Use trace() for non-invasive debugging:

    # Add debugging to existing function without modifying source
    trace(calculate_diversity_index, 
          tracer = quote(print(paste("Input has", nrow(employees), "rows"))))
    
    # Remove tracing
    untrace(calculate_diversity_index)
  4. Defensive programming with assertions:

    process_interactions <- function(employees, interactions) {
      # Validate inputs
      assert_data_table(employees)
      assert_data_table(interactions)
      assert_subset(c("agent1_id", "agent2_id"), names(interactions))
    
      # Check data integrity
      invalid_agents <- setdiff(
        c(interactions$agent1_id, interactions$agent2_id),
        employees$agent_id
      )
    
      if (length(invalid_agents) > 0) {
        stop(sprintf("Invalid agent IDs in interactions: %s",
                    paste(invalid_agents, collapse = ", ")))
      }
    
      # ... rest of function ...
    }

9.8.3 Performance Profiling

  1. Profile entire simulation:

    library(profvis)
    
    profvis({
      results <- run_asa_simulation(
        n_steps = 100,
        initial_size = 500,
        params = list(
          n_interactions_per_step = 20,
          verbose = FALSE
        )
      )
    })
  2. Benchmark specific functions:

    library(microbenchmark)
    
    # Compare different diversity calculations
    employees <- create_test_employees(1000)
    
    microbenchmark(
      blau = calculate_diversity_index(employees, "blau"),
      shannon = calculate_diversity_index(employees, "shannon"),
      custom = calculate_diversity_index(employees, "custom"),
      times = 100
    )
  3. Memory profiling:

    library(pryr)
    
    # Check object sizes
    object_size(org_state$employees)
    object_size(org_state$interaction_history)
    
    # Track memory usage
    mem_before <- mem_used()
    results <- run_asa_simulation(n_steps = 50)
    mem_after <- mem_used()
    
    cat(sprintf("Memory used: %.2f MB\n", 
               (mem_after - mem_before) / 1024^2))

9.8.4 Using Verbose Mode Effectively

  1. Implement tiered verbosity:

    # In parameters
    params$verbose_level <- 2  # 0=silent, 1=basic, 2=detailed, 3=debug
    
    # In functions
    log_message <- function(message, level = 1, current_level = params$verbose_level) {
      if (current_level >= level) {
        cat(paste0(rep("  ", level - 1), message, "\n"))
      }
    }
    
    # Usage
    log_message("Starting simulation", level = 1)
    log_message("Initializing organization", level = 2)
    log_message(sprintf("Created %d employees", nrow(employees)), level = 3)
  2. Add progress bars for long operations:

    run_asa_simulation <- function(n_steps, ..., show_progress = TRUE) {
      if (show_progress) {
        pb <- txtProgressBar(min = 0, max = n_steps, style = 3)
      }
    
      for (step in 1:n_steps) {
        # ... simulation step ...
    
        if (show_progress) {
          setTxtProgressBar(pb, step)
        }
      }
    
      if (show_progress) {
        close(pb)
      }
    }

9.9 Summary

This contributor’s guide provides a comprehensive walkthrough for developing and extending the ASA ABM v2 codebase. Key takeaways:

  1. Start small: Begin with simple features and gradually tackle more complex ones
  2. Test everything: Write tests before, during, and after implementation
  3. Document as you go: Future you (and other contributors) will thank you
  4. Profile early: Don’t wait until performance becomes a problem
  5. Ask questions: The community is here to help

Remember that good code is not just code that works—it’s code that others can understand, maintain, and build upon. Happy contributing!