Learning Social Behavior with Reinforcement

This project was independently developed as part of the Seminar in Cognitive Science (COG403) at the University of Toronto.

Role

Cognitive Modeler, Python Developer, Researcher

Tools

pyClarion (Cognitive Architecture), Python, NumPy, Matplotlib

Focus Areas

Human-Like AI, Reinforcement Learning, Behavioral Modeling, Social Cognition

Timeline

January-April 2025

Contributions


  • Developed an end-to-end simulation pipeline in Python using pyClarion, integrating custom learning modules to evaluate multiple social learning scenarios in a dynamic environment.
  • Designed and implemented a dual-process learning agent using Rescorla-Wagner-based mechanisms for both stimulus-response and stimulus-value learning.
  • Created four targeted training scenarios to explore how social behaviors like imitation, threat avoidance, and social approach emerge through reinforcement learning.
  • Generated custom data visualizations to track key metrics such as average reward, Q-value progression, and associative strength across 900+ training trials.
  • Conducted in-depth analysis of learning patterns to assess the model’s adaptability and the effectiveness of associative mechanisms in producing socially relevant behavior.

Project Overview


Can machines learn to imitate, avoid danger, or respond socially without explicit instructions?

In this simulation-based project, I modeled human social learning through purely associative mechanisms, using the CLARION cognitive architecture and a custom implementation of the Rescorla-Wagner algorithm. By eliminating symbolic reasoning and relying solely on reinforcement-based learning, I explored how adaptive behaviors, such as imitation, social approach, and avoidance, can emerge from domain-general learning processes.
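At the core of the model is the Rescorla-Wagner update, which adjusts an association's strength in proportion to the prediction error on each trial. The sketch below is illustrative only; the parameter names and values are assumptions, not the project's actual code.

```python
# Minimal sketch of a Rescorla-Wagner update (illustrative; alpha and beta
# values here are assumptions, not the project's actual parameters).
def rescorla_wagner_update(V, reward, alpha=0.1, beta=1.0):
    """Return updated associative strength V after one trial.

    V      -- current associative strength for the cue
    reward -- lambda, the maximum association the reward supports
    alpha  -- cue salience (stimulus learning rate)
    beta   -- learning rate tied to the reward
    """
    return V + alpha * beta * (reward - V)

V = 0.0
for _ in range(50):  # repeated pairings of cue and reward
    V = rescorla_wagner_update(V, reward=1.0)
# V asymptotically approaches the reward maximum of 1.0
```

Because the update is driven by the error term (reward - V), learning is fast when the outcome is surprising and slows as the association saturates, which is what produces the characteristic negatively accelerated learning curve.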


The results demonstrate that associative mechanisms alone can produce complex, socially relevant behaviors typically attributed to higher-level cognition. This offers meaningful insights into the foundational processes of social behavior in both humans and artificial agents.

Architecture


The agent architecture consists of:

  • Input Process: encodes environmental stimuli into social or non-social categories.
  • Rescorla-Wagner Learners:
    • Stimulus-response learning
    • Stimulus-value learning
  • Choice Process: selects actions based on learned associations and value estimates.

Implemented in pyClarion, the system excludes symbolic reasoning and instead relies solely on reinforcement-driven association updates.
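The three-stage pipeline above can be sketched as a single agent class. This is a hypothetical reconstruction, not the pyClarion implementation: the class and attribute names, the softmax choice rule, and all parameter values are assumptions made for illustration.

```python
import numpy as np

# Hypothetical sketch of the input -> learners -> choice pipeline.
# Names, the softmax rule, and parameters are illustrative assumptions.
class AssociativeAgent:
    def __init__(self, n_stimuli, n_actions, alpha=0.1, temperature=0.5):
        self.sr = np.zeros((n_stimuli, n_actions))  # stimulus-response strengths
        self.sv = np.zeros(n_stimuli)               # stimulus-value estimates
        self.alpha = alpha
        self.temperature = temperature

    def choose(self, stimulus):
        # Choice process: softmax over learned stimulus-response strengths
        prefs = self.sr[stimulus] / self.temperature
        probs = np.exp(prefs - prefs.max())
        probs /= probs.sum()
        return np.random.choice(len(probs), p=probs)

    def update(self, stimulus, action, reward):
        # Rescorla-Wagner error-driven updates for both learners
        self.sr[stimulus, action] += self.alpha * (reward - self.sr[stimulus, action])
        self.sv[stimulus] += self.alpha * (reward - self.sv[stimulus])
```

Keeping the stimulus-response and stimulus-value learners as separate tables mirrors the dual-process design: one tracks which action a cue should trigger, the other tracks how valuable the cue itself is.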

Scenarios Simulated


  1. Social Presence -> Approach
  2. Behavioral Imitation
  3. Stimulus X -> Approach
  4. Predator Warning -> Escape

Each trial randomly sampled one of these contexts over 900 iterations, with periodic resets to simulate changing environmental demands.
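A self-contained version of this training loop might look like the following. The scenario-to-action mapping, epsilon-greedy choice rule, and reset interval of 300 trials are assumptions for illustration; only the four contexts, the 900-trial budget, and the periodic resets come from the project description.

```python
import random

# Illustrative training loop. Action names, the reward mapping, the
# epsilon-greedy rule, and reset_every=300 are assumptions.
SCENARIOS = ["social_presence", "imitation", "stimulus_x", "predator_warning"]
ACTIONS = ["approach", "imitate", "escape", "ignore"]
CORRECT = {"social_presence": "approach", "imitation": "imitate",
           "stimulus_x": "approach", "predator_warning": "escape"}

def run_training(n_trials=900, reset_every=300, alpha=0.1, epsilon=0.1):
    q = {(s, a): 0.0 for s in SCENARIOS for a in ACTIONS}
    rewards = []
    for t in range(n_trials):
        if t and t % reset_every == 0:
            q = {k: 0.0 for k in q}  # reset simulates changed environmental demands
        s = random.choice(SCENARIOS)  # randomly sample one context per trial
        if random.random() < epsilon:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda act: q[(s, act)])
        r = 1.0 if a == CORRECT[s] else 0.0
        q[(s, a)] += alpha * (r - q[(s, a)])  # Rescorla-Wagner-style update
        rewards.append(r)
    return q, rewards

q, rewards = run_training()
```

Each reset wipes the learned associations, so the reward trace shows a dip followed by relearning, which is one way the simulation probes the agent's adaptability.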

Key Findings & Visualizations


Agent Confidence Over Time


The agent’s internal confidence (Q-values) steadily increased over time, indicating effective learning and convergence toward stable action policies.
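The kind of Q-value trace summarized above can be reproduced with a short Matplotlib sketch. The trajectory here is simulated with a Rescorla-Wagner update (alpha = 0.05), not taken from the project's actual data.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend for saving figures
import matplotlib.pyplot as plt
import numpy as np

# Simulated Q-value trajectory (illustrative; alpha and the reward level
# are assumptions, not the project's recorded data).
alpha, reward = 0.05, 1.0
q = np.zeros(900)
for t in range(1, 900):
    q[t] = q[t - 1] + alpha * (reward - q[t - 1])

fig, ax = plt.subplots()
ax.plot(q, label="Q-value")
ax.set_xlabel("Trial")
ax.set_ylabel("Q-value")
ax.set_title("Agent Confidence Over Time")
ax.legend()
fig.savefig("q_values.png")
```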

Emergence of Social Behaviors


Distinct stimulus-response associations emerged across scenarios, showing how the agent learned behaviors like imitation, escape, and approach through reinforcement.

Social Reward Trends


Average reward per trial fluctuated across conditions, reflecting the complexity of learning in mixed social and non-social environments.

Impact & Insights


  • Demonstrated that complex behaviors like imitation can emerge from low-level reinforcement learning, without relying on explicit symbolic reasoning.
  • Provided a proof-of-concept for how AI agents could acquire social intelligence through trial-and-error learning alone.
  • Generated insights into cognitive flexibility, cue salience, and adaptive behavior modeling across both artificial and biological systems.

Significance


This model lays the groundwork for developing more human-like artificial agents that can learn socially through observation and interaction, which is key for applications in human-robot interaction, AI safety, and adaptive user modeling.

Future Work


To better emulate human-level social cognition, future iterations could incorporate:

  • Episodic memory structures
  • Hierarchical or symbolic reasoning modules
  • Multi-agent social environments with variable reward conditions