# Differentiation of Behaviours in Learning Pheromone-based Communication

This document details the code, parameters, and configurations used to obtain the results described in the article *Differentiation of Behaviours in Learning Pheromone-based Communication*.


## ⚙️ Installation

1. Make sure you have Python 3.10 installed.

2. Create a virtual environment:

   ```bash
   python -m venv path_to_new_virtual_env
   source path_to_new_virtual_env/bin/activate
   ```

3. Clone this repository:

   ```bash
   git clone https://github.com/dav3-b/DBLPC.git
   cd DBLPC
   ```

4. Install the required dependencies:

   ```bash
   pip install -r requirements.txt
   ```

**Operating system:** the code was tested on Ubuntu 22.04, CPU only.


## 📂 Project Structure

```
DBLPC/
├── agents                      # Algorithms folder
│   ├── IQLearning              # Independent Q-Learning implementation
│   │   └── config              # Algorithm configuration files
│   ├── NoLearning              # Deterministic policy implementation
│   └── utils                   # Utility functions
└── environments                # Multi-agent environments
    └── slime                   # Slime environment
        └── config              # Env configuration files
```

## 🚀 Running the Code

### IQLearning

The main script is `slime_iql.py`, which accepts the following command-line arguments (a parsing sketch follows the table):

| Argument | Type | Default Value | Description |
|---|---|---|---|
| `--train` | bool | `False` | If `True`, trains the agents; otherwise runs evaluation. |
| `--random_seed` | int | `42` | Changes the default random seed for reproducibility. |
| `--qtable_path` | str | `None` | Path to a `.npy` file from which the Q-table is loaded for evaluation. |
| `--render` | bool | `False` | If `True`, renders the environment visually. |
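The examples below pass explicit values to the boolean flags (e.g. `--train True`), which suggests string-to-bool argument parsing rather than store-true switches. A minimal `argparse` sketch of such a CLI, assuming a hypothetical `str2bool` helper (not taken from the repository):

```python
import argparse

def str2bool(value: str) -> bool:
    # Hypothetical helper: interpret "True"/"true"/"1" as True, anything else as False.
    return str(value).lower() in ("true", "1", "yes")

parser = argparse.ArgumentParser(description="IQL on the slime environment")
parser.add_argument("--train", type=str2bool, default=False,
                    help="If True, train the agents; otherwise evaluate.")
parser.add_argument("--random_seed", type=int, default=42,
                    help="Random seed for reproducibility.")
parser.add_argument("--qtable_path", type=str, default=None,
                    help="Path to a .npy Q-table to load for evaluation.")
parser.add_argument("--render", type=str2bool, default=False,
                    help="If True, render the environment.")
args = parser.parse_args()
```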

**Example: training run**

```bash
python slime_iql.py --train True --random_seed 99
```

The learned Q-table is automatically saved to the `./runs/weights` folder.
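To inspect a saved Q-table, a minimal NumPy sketch (the file name is the placeholder used below; the array's shape depends on the environment's state and action spaces):

```python
import numpy as np

# Placeholder file name; use the .npy produced by your training run.
qtable = np.load("./runs/weights/file_name.npy")
print(qtable.shape)  # shape depends on the state/action spaces
```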

**Example: evaluation run**

```bash
python slime_iql.py --random_seed 99 --qtable_path ./runs/weights/file_name.npy --render True
```

### Deterministic Policy

The main script is `slime_deterministic.py`, which accepts the following command-line arguments:

| Argument | Type | Default Value | Description |
|---|---|---|---|
| `--random_seed` | int | `42` | Changes the default random seed for reproducibility. |
| `--episodes` | int | `2000` | Number of episodes. |
| `--render` | bool | `False` | If `True`, renders the environment visually. |

**Run example**

```bash
python slime_deterministic.py --random_seed 99 --episodes 100 --render True
```

## ⚙️ Key Parameters

### Environment

| Parameter | Tested Values | Used Value | Description |
|---|---|---|---|
| World-size | (20x20), (22x22), (25x25), (31x31) | (22x22) | Size of the grid world (a torus) in which agents move. |
| Clustering-population ($N_C$) | [20, 14, 10, 6, 0] | [20, 14, 10, 6, 0] | Number of clustering agents. |
| Scattering-population ($N_S$) | [20, 14, 10, 6, 0] | [20, 14, 10, 6, 0] | Number of scattering agents. |
| Sniff-threshold | [20, 14, 10, 6, 0] | 0.9 | Minimum amount of pheromone an agent can smell. |
| Sniff-patches | [3, 5, 8] | 5 | Number of 1-hop neighboring patches in which the agent can smell the pheromone. |
| Wiggle-patches | [3, 5, 8] | 5 | Number of 1-hop neighboring patches through which the agent can move randomly. |
| Diffuse-area | [0.5, 1.0, 1.5] | 0.5 | Standard deviation of the Gaussian function used to spread the pheromone in the environment (see the diffusion sketch after this table). |
| Diffuse-radius | 1.0 | 1.0 | Radius of the Gaussian function used to spread the pheromone in the environment. |
| Evaporation-rate | [0.8, 0.85, 0.9, 0.95] | 0.95 | Fraction of pheromone that does not evaporate in the environment. |
| Lay-area | [0, 1] | 1 | Number of patches in which the pheromone is released. |
| Lay-amount | [1, 2, 3, 5] | 3 | Amount of pheromone deposited evenly over Lay-area. |
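A minimal sketch of the diffusion/evaporation dynamics these parameters describe, assuming a per-tick Gaussian blur on the torus grid. It uses `scipy.ndimage.gaussian_filter` with wrap-around boundaries purely as an illustration; it is not the repository's implementation:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

DIFFUSE_AREA = 0.5       # std. dev. of the Gaussian (used value)
DIFFUSE_RADIUS = 1.0     # radius of the Gaussian (used value)
EVAPORATION_RATE = 0.95  # fraction of pheromone kept each tick (used value)

def pheromone_step(pheromone: np.ndarray) -> np.ndarray:
    # Spread pheromone with a Gaussian blur on the torus (wrap-around grid);
    # truncate approximates bounding the kernel at Diffuse-radius (radius / sigma).
    diffused = gaussian_filter(pheromone, sigma=DIFFUSE_AREA, mode="wrap",
                               truncate=DIFFUSE_RADIUS / DIFFUSE_AREA)
    # Then evaporate: only EVAPORATION_RATE of the pheromone survives the tick.
    return diffused * EVAPORATION_RATE

grid = np.zeros((22, 22))  # (22x22) world size (used value)
grid[11, 11] = 3.0         # Lay-amount of 3 deposited on one patch (Lay-area = 1)
grid = pheromone_step(grid)
```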

### Learning

| Parameter | Tested Values | Used Value | Description |
|---|---|---|---|
| Cluster-threshold ($c_{th}$) | [1, 5, 10] | 1 | Minimum number of agents within Cluster-radius for clustering to be detected. |
| Cluster-radius | [1, 2, 3] | 1 | Distance (in number of patches) around an agent used to check clustering; it is used for computing rewards and metrics. |
| Clustering-reward ($r_C$) | [0, 1, 10, 100] | 10 | Base reward given upon clustering. |
| Clustering-penalty ($p_C$) | [0, -1, -10, -100] | -1 | Base penalty given for not clustering. |
| Scattering-reward ($r_S$) | [0, 1, 10, 100] | 0 | Base reward given upon scattering. |
| Scattering-penalty ($p_S$) | [0, -1, -10, -100] | -1 | Base penalty given for not scattering. |
| Ticks-per-episode | [250, 500, 1000] | 500 | Duration of a learning episode in simulation ticks. |
| Episodes | [3000, 5000, 10000] | 3000 | Number of learning episodes. |
| Learning-rate ($\alpha$) | [0.01, 0.025, 0.05, 0.1] | [0.01, 0.025, 0.05] | Magnitude of Q-value updates. |
| Discount-factor ($\gamma$) | [0.9, 0.95, 0.99, 0.999] | [0.9, 0.99] | How much future rewards are valued. |
| Epsilon-init ($\epsilon_{init}$) | 1.0 | 1.0 | Initial exploration rate. |
| Epsilon-min ($\epsilon_{min}$) | [5e-3, 1e-3, 5e-4, 1e-4, 0.0] | 0.0 | Minimum value of epsilon. |
| Epsilon-decay ($\lambda$) | [0.995, 0.997, 0.999] | 0.995 | Factor by which epsilon decreases after each action, decaying from $\epsilon_{init}$ towards $\epsilon_{min}$ (see the sketch after this table). |
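A minimal sketch of how these parameters typically interact in a tabular Q-learning agent with a decaying epsilon-greedy policy (a generic illustration under the used values, not the repository's code; the state/action space sizes are hypothetical):

```python
import numpy as np

ALPHA, GAMMA = 0.05, 0.9                   # learning-rate, discount-factor (used values)
EPS, EPS_MIN, EPS_DECAY = 1.0, 0.0, 0.995  # epsilon schedule (used values)

rng = np.random.default_rng(42)
n_states, n_actions = 4, 3                 # hypothetical sizes; they depend on the env
Q = np.zeros((n_states, n_actions))

def select_action(state: int) -> int:
    # Epsilon-greedy: explore with probability EPS, otherwise exploit.
    if rng.random() < EPS:
        return int(rng.integers(n_actions))
    return int(np.argmax(Q[state]))

def update(state: int, action: int, reward: float, next_state: int) -> None:
    global EPS
    # Standard tabular Q-learning update.
    td_target = reward + GAMMA * np.max(Q[next_state])
    Q[state, action] += ALPHA * (td_target - Q[state, action])
    # Decay epsilon after each action, floored at EPS_MIN.
    EPS = max(EPS_MIN, EPS * EPS_DECAY)
```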

## 🛠️ Configuration Files

The following `.json` configuration files are used to manage the experiments' parameters (a loading sketch follows the table):

| File Name | Purpose |
|---|---|
| `/environments/slime/config/env-params.json` | Defines the environment settings. |
| `/environments/slime/config/env_visualizer.json` | Controls the rendering configuration for visualizing the environment. |
| `/agents/IQLearning/config/learning-params.json` | Contains learning-related parameters such as the learning rate, epsilon decay, etc. |
| `/agents/IQLearning/config/logger-params.json` | Configures the logging behavior and export mode. |
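A minimal sketch of loading one of these files (the keys inside the JSON are not documented here, so what gets printed is simply whatever the file defines):

```python
import json
from pathlib import Path

# Load the environment parameters shipped with the repository.
with Path("environments/slime/config/env-params.json").open() as f:
    env_params = json.load(f)
print(env_params)  # keys/values as defined in the file
```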

## 📈 Evaluation Metrics

All evaluation metrics described in the paper are automatically logged in `/runs/train` (for training) and in `/runs/eval` (for evaluation).


## 💾 Reproducibility

The results in the paper are averages over 10 identical experiments, conducted on a population of 20 total agents.

The random seeds we used are: [10, 20, 30, 40, 50, 60, 70, 80, 90, 100].
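A minimal driver sketch for sweeping the training script over these seeds via the documented CLI (adjust the flags to your experiment):

```python
import subprocess

SEEDS = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]  # seeds used in the paper

for seed in SEEDS:
    # One training run per seed; paper results average over all runs.
    subprocess.run(
        ["python", "slime_iql.py", "--train", "True", "--random_seed", str(seed)],
        check=True,
    )
```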


## 📚 Citation

If you use this codebase in your research, please cite the following article:

> **Differentiation of Behaviours in Learning Pheromone-based Communication**
> Authors: Davide Borghi, Stefano Mariani, and Franco Zambonelli
> Under review for the 4th DISCOLI workshop on DIStributed COLlective Intelligence, 2025.
