Codebase Installation
If you have not yet been invited as a collaborator on the "BFM" GitHub repository, please email me at zaho@mit.edu with your GitHub username, then check your email for an invitation.
Onboarding
Please watch the onboarding video at this link (requires an MIT Zoom login): ONBOARDING VIDEO (20 min)
Setup Instructions
- Clone the codebase to Openmind. Create a directory in /om2/user/<your_username>/<your_name> and work there.
  git clone https://github.com/azaho/bfm.git
  cd bfm
- Create a virtual environment and install the required packages:
  python -m venv .venv
  source .venv/bin/activate
  pip install --upgrade pip
  pip install -e .[dev]
- Copy over the contents of .env.example to .env.
- If you're not on Openmind, you'll need to download the BrainTreebank dataset (~130 GB). If you're on Openmind, you are all set! The dataset is already available in the /om2/user/<your_username>/braintreebank directory.
  - Download this script: braintreebank_download_extract.py from the Neuroprobe repository.
  - Install necessary packages:
    pip install beautifulsoup4 requests
  - Run the script to download and extract the dataset:
    python braintreebank_download_extract.py
  - Update the .env variable BRAIN_TREEBANK_ROOT_DIR to point to the root directory of the BrainTreebank dataset on your machine.
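The .env steps above can be sanity-checked with a small script. The following is a minimal sketch, not part of the codebase: the function names are hypothetical, and it assumes .env holds plain KEY=VALUE lines (the actual .env.example may use a richer syntax).

```python
import shutil
from pathlib import Path


def ensure_env_file(repo_dir: Path) -> Path:
    """Create .env from .env.example if .env does not exist yet."""
    env_path = repo_dir / ".env"
    if not env_path.exists():
        shutil.copyfile(repo_dir / ".env.example", env_path)
    return env_path


def read_env(env_path: Path) -> dict:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments.

    Assumption: no multi-line values, no `export` prefixes.
    """
    values = {}
    for line in env_path.read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")
    return values


def dataset_root_ok(env_path: Path) -> bool:
    """True if BRAIN_TREEBANK_ROOT_DIR is set and names an existing directory."""
    root = read_env(env_path).get("BRAIN_TREEBANK_ROOT_DIR", "")
    return bool(root) and Path(root).expanduser().is_dir()
```

Run from the repository root, `dataset_root_ok(ensure_env_file(Path.cwd()))` returns False when the variable is unset or points at a directory that does not exist, which is worth catching before launching a job.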
Weights and Biases
We visualize our training runs using Weights and Biases (wandb).
Registration
Create an account with your educational email address at wandb.ai. Indicate that you are a student during registration to get the free upgrade.
Once registered, install the CLI tool:
pip install wandb
Then log in to the CLI, pasting your API key when prompted:
wandb login
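To confirm that the login step took effect, one option is to check for a stored key. This is a rough sketch with a hypothetical helper name. It assumes the key is either in the WANDB_API_KEY environment variable or in a ~/.netrc entry for api.wandb.ai, which is where the CLI normally stores it on Linux and macOS; a self-hosted wandb server would use a different host name.

```python
import netrc
import os
from pathlib import Path


def wandb_key_available(env=None, netrc_path=None) -> bool:
    """Return True if a wandb API key is visible via WANDB_API_KEY or netrc."""
    env = os.environ if env is None else env
    if env.get("WANDB_API_KEY"):
        return True
    # Assumption: default netrc location; Windows uses a different file name.
    path = Path(netrc_path) if netrc_path else Path.home() / ".netrc"
    if not path.is_file():
        return False
    try:
        auth = netrc.netrc(str(path)).authenticators("api.wandb.ai")
    except (netrc.NetrcParseError, OSError):
        return False
    # authenticators() returns (login, account, password) or None.
    return bool(auth and auth[2])
```

Calling `wandb_key_available()` with no arguments inspects the current environment and home directory.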
Now you can set the wandb project with the --cluster.wandb_project argument when launching training. For example, to set the project name to "bfm":
python pretrain.py --training.setup_name andrii0 --cluster.wandb_project bfm
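When launching several runs under different setups or projects, it can help to assemble the command programmatically. The sketch below uses a hypothetical helper, `build_pretrain_command`, and only the flags shown in the command above; the `--key value` form and the `pretrain.py` entry point are taken from that command, not from the codebase itself.

```python
import shlex


def build_pretrain_command(overrides, script="pretrain.py"):
    """Turn a dict of dotted option names into an argument list."""
    cmd = ["python", script]
    for key, value in overrides.items():
        cmd += [f"--{key}", str(value)]
    return cmd


command = build_pretrain_command(
    {"training.setup_name": "andrii0", "cluster.wandb_project": "bfm"}
)
# Prints: python pretrain.py --training.setup_name andrii0 --cluster.wandb_project bfm
print(shlex.join(command))
```

The resulting list can be passed directly to `subprocess.run(command)`, which avoids shell quoting problems.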
Running a Pretraining Job
Once you have set up the codebase and installed the required packages, you can start pretraining a model. This requires an A100 GPU. If you're on Openmind, you can request a node with an A100 GPU by following the instructions in the Openmind documentation.
python -m bfm.pretrain --training.setup_name andrii0 --cluster.cache_subjects 0 --cluster.eval_at_beginning 0
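Since pretraining requires an A100, it may be worth checking the allocated GPU before starting a long job. The following is a minimal sketch with hypothetical function names. It assumes `nvidia-smi` is on the PATH of the compute node and that A100 cards report a name containing "A100".

```python
import shutil
import subprocess


def query_gpu_names():
    """Return GPU names reported by nvidia-smi, or [] if it is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0:
        return []
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


def has_a100(gpu_names):
    """True if any reported GPU name mentions A100."""
    return any("A100" in name for name in gpu_names)


print("A100 available:", has_a100(query_gpu_names()))
```

On a login node or a machine without NVIDIA drivers, `query_gpu_names()` returns an empty list and the check reports False.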