mdae package

Submodules

mdae.build_model module

class biobb_pytorch.mdae.build_model.BuildModel(input_stats_pt_path: str, output_model_pth_path: str | None = None, properties: dict | None = None, **kwargs)[source]

Bases: BiobbObject

biobb_pytorch BuildModel
Build a Molecular Dynamics AutoEncoder (MDAE) PyTorch model.
Builds a PyTorch autoencoder from the given properties.

Parameters:

input_stats_pt_path (str) – Path to the input model statistics file. File type: input. Sample file. Accepted formats: pt (edam:format_2333).
output_model_pth_path (str) (Optional) –
Path to save the model in .pth format. File type: output. Sample file. Accepted formats: pth (edam:format_2333).
properties (dict - Python dictionary object containing the tool parameters, not input/output files) –
- model_type (str) - (“AutoEncoder”) Name of the model class to instantiate (must exist in biobb_pytorch.mdae.models).
- n_cvs (int) - (1) Dimensionality of the latent space.
- encoder_layers (list) - ([16]) List of integers representing the number of neurons in each encoder layer.
- decoder_layers (list) - ([16]) List of integers representing the number of neurons in each decoder layer.
- options (dict) - ({“norm_in”: {“mode”: “min_max”}}) Additional options (e.g. norm_in, optimizer, loss_function, device, etc.).
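
The values in parentheses above are the documented defaults. As a rough sketch of how a partial properties dictionary could be combined with them, the snippet below uses a shallow merge. `DEFAULTS` and `with_defaults` are hypothetical names introduced for illustration; they are not part of biobb_pytorch, and the library's own merging logic may differ.

```python
import copy

# Defaults copied from the parameter list above.
DEFAULTS = {
    "model_type": "AutoEncoder",
    "n_cvs": 1,
    "encoder_layers": [16],
    "decoder_layers": [16],
    "options": {"norm_in": {"mode": "min_max"}},
}


def with_defaults(user_props):
    """Return a new dict in which user-supplied keys override DEFAULTS (shallow merge)."""
    merged = copy.deepcopy(DEFAULTS)
    merged.update(user_props or {})
    return merged


props = with_defaults({"n_cvs": 2, "encoder_layers": [64, 32]})
print(props["n_cvs"], props["encoder_layers"], props["decoder_layers"])
```

Because the merge is shallow, a user-supplied `options` dictionary replaces the default one as a whole in this sketch rather than being merged key by key.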

Examples

This example shows how to use the BuildModel class to build a PyTorch autoencoder model:

from biobb_pytorch.mdae.build_model import build_model

input_stats_pt_path = "input_stats.pt"
output_model_pth_file = "model.pth"

n_features = 128
prop = {
    'model_type': 'AutoEncoder',
    'n_cvs': 10,
    'encoder_layers': [n_features, 64, 32],
    'decoder_layers': [32, 64, n_features],
    'options': {
        'norm_in': {"mode": "min_max"},
        'optimizer': {
            'lr': 1e-4
        }
    }
}

build_model(input_stats_pt_path=input_stats_pt_path,
           output_model_pth_path=output_model_pth_file,
           properties=prop)
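
One way to read the layer lists in this example is as successive layer widths, with the latent dimension `n_cvs` sitting between the encoder and the decoder. The sketch below pairs consecutive widths into (input, output) sizes under that assumption. `layer_shapes` is a hypothetical helper, and how biobb_pytorch actually interprets these lists is not stated in this page.

```python
def layer_shapes(widths):
    """Pair consecutive widths into (in_features, out_features) tuples."""
    return list(zip(widths, widths[1:]))


n_features = 128
n_cvs = 10
encoder_layers = [n_features, 64, 32]
decoder_layers = [32, 64, n_features]

# Assumption: the encoder ends at the latent space and the decoder starts from it.
encoder_shapes = layer_shapes(encoder_layers + [n_cvs])
decoder_shapes = layer_shapes([n_cvs] + decoder_layers)

print(encoder_shapes)
print(decoder_shapes)
```

Under this reading the decoder mirrors the encoder, so the reconstruction has the same number of features as the input.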

Info:

wrapped_software:
- name: PyTorch
- version: >=1.6.0
- license: BSD 3-Clause
ontology:
- name: EDAM
- schema: http://edamontology.org/EDAM.owl

launch() → int[source]: Execute the BuildModel object

static load_full(path: str) → Module[source]: Load a model serialized with save_full.

classmethod load_weights(props: Dict[str, Any], path: str) → BuildModel[source]: Instantiate from props and load state_dict from path.

save_full() → None[source]: Serialize the full model object (including architecture).

save_weights(path: str) → None[source]: Save model.state_dict() to the given path.
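
The four methods above come in two pairs: weights-only persistence needs the properties again at load time, while full serialization carries the architecture with it. The following plain-Python analogy illustrates that difference with JSON files. `TinyModel` is a hypothetical stand-in, not the BuildModel implementation; it uses no PyTorch, and its `save_full` takes a path argument that the documented signature does not show.

```python
import json
import os
import tempfile


class TinyModel:
    """Hypothetical stand-in: 'props' fix the architecture, 'weights' hold the parameters."""

    def __init__(self, props):
        self.props = dict(props)
        # One zero-initialised weight per latent dimension.
        self.weights = [0.0] * self.props["n_cvs"]

    def save_weights(self, path):
        # Weights only: the file says nothing about the architecture.
        with open(path, "w") as fh:
            json.dump(self.weights, fh)

    @classmethod
    def load_weights(cls, props, path):
        # The caller must supply props so the architecture can be rebuilt first.
        model = cls(props)
        with open(path) as fh:
            model.weights = json.load(fh)
        return model

    def save_full(self, path):
        # Architecture and weights travel together.
        with open(path, "w") as fh:
            json.dump({"props": self.props, "weights": self.weights}, fh)

    @staticmethod
    def load_full(path):
        # No props needed: everything is in the file.
        with open(path) as fh:
            blob = json.load(fh)
        model = TinyModel(blob["props"])
        model.weights = blob["weights"]
        return model


workdir = tempfile.mkdtemp()
original = TinyModel({"n_cvs": 3})
original.weights = [0.1, 0.2, 0.3]

weights_path = os.path.join(workdir, "weights.json")
original.save_weights(weights_path)
from_weights = TinyModel.load_weights({"n_cvs": 3}, weights_path)

full_path = os.path.join(workdir, "full.json")
original.save_full(full_path)
from_full = TinyModel.load_full(full_path)
```

This is why `load_weights` is a classmethod that takes `props`, whereas `load_full` is a static method that only needs a path.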

biobb_pytorch.mdae.build_model.build_model(properties: dict, input_stats_pt_path: str, output_model_pth_path: str | None = None, **kwargs) → int[source]

biobb_pytorch BuildModel
Build a Molecular Dynamics AutoEncoder (MDAE) PyTorch model.
Builds a PyTorch autoencoder from the given properties.

Parameters:

input_stats_pt_path (str) –
Path to the input model statistics file. File type: input. Sample file. Accepted formats: pt (edam:format_2333).
output_model_pth_path (str) (Optional) –
Path to save the model in .pth format. File type: output. Sample file. Accepted formats: pth (edam:format_2333).
properties (dict - Python dictionary object containing the tool parameters, not input/output files) –
- model_type (str) - (“AutoEncoder”) Name of the model class to instantiate (must exist in biobb_pytorch.mdae.models).
- n_cvs (int) - (1) Dimensionality of the latent space.
- encoder_layers (list) - ([16]) List of integers representing the number of neurons in each encoder layer.
- decoder_layers (list) - ([16]) List of integers representing the number of neurons in each decoder layer.
- options (dict) - ({“norm_in”: {“mode”: “min_max”}}) Additional options (e.g. norm_in, optimizer, loss_function, device, etc.).

Examples

This example shows how to use the BuildModel class to build a PyTorch autoencoder model:

from biobb_pytorch.mdae.build_model import build_model

input_stats_pt_path = "input_stats.pt"
output_model_pth_file = "model.pth"

n_features = 128
prop = {
    'model_type': 'AutoEncoder',
    'n_cvs': 10,
    'encoder_layers': [n_features, 64, 32],
    'decoder_layers': [32, 64, n_features],
    'options': {
        'norm_in': {"mode": "min_max"},
        'optimizer': {
            'lr': 1e-4
        }
    }
}

build_model(input_stats_pt_path=input_stats_pt_path,
           output_model_pth_path=output_model_pth_file,
           properties=prop)

Info:

wrapped_software:
- name: PyTorch
- version: >=1.6.0
- license: BSD 3-Clause
ontology:
- name: EDAM
- schema: http://edamontology.org/EDAM.owl

mdae.mdfeaturizer module

class biobb_pytorch.mdae.mdfeaturizer.MDFeaturePipeline(input_topology_path: str, output_dataset_pt_path: str, output_stats_pt_path: str, properties: dict, input_trajectory_path: str | None = None, input_labels_npy_path: str | None = None, input_weights_npy_path: str | None = None, **kwargs)[source]

Bases: BiobbObject

biobb_pytorch mdfeaturizer
Obtain the Molecular Dynamics Features for PyTorch model training.

Parameters:

input_trajectory_path (str) (Optional) –
Path to the input trajectory file (if omitted, the topology file is used as the trajectory). File type: input. Sample file. Accepted formats: xtc (edam:format_3875), dcd (edam:format_3878).
input_topology_path (str) –
Path to the input topology file. File type: input. Sample file. Accepted formats: pdb (edam:format_2333).
output_dataset_pt_path (str) –
Path to the output dataset model file. File type: output. Sample file. Accepted formats: pt (edam:format_2333).
output_stats_pt_path (str) –
Path to the output model statistics file. File type: output. Sample file. Accepted formats: pt (edam:format_2333).
properties (dict - Python dictionary object containing the tool parameters, not input/output files) –
- cartesian (dict) - ({“selection”: “name CA”}) Atom selection options for Cartesian coordinates feature generation (e.g. selection, fit_selection).
- distances (dict) - ({“selection”: “name CA”, “cutoff”: 0.4, “periodic”: True, “bonded”: False}) Atom selection options for pairwise distance features (selection, cutoff, periodic, bonded, etc.).
- angles (dict) - ({“selection”: “backbone”, “periodic”: True, “bonded”: True}) Atom selection options for angle features (selection, periodic, bonded, etc.).
- dihedrals (dict) - ({“selection”: “backbone”, “periodic”: True, “bonded”: True}) Atom selection options for dihedral features (selection, periodic, bonded, etc.).
- options (dict) - ({“norm_in”: {“mode”: “min_max”}}) General processing options (e.g. timelag, norm_in).
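
The default `norm_in` mode is `min_max`. Assuming this refers to the usual min-max scaling (each feature mapped to the range [0, 1] using its minimum and maximum over all frames), the idea can be sketched in plain Python as follows. `min_max_stats` and `min_max_scale` are hypothetical helpers, and the statistics actually stored in the output stats file may be organised differently.

```python
def min_max_stats(rows):
    """Per-feature minimum and maximum over a list of equally long feature rows."""
    columns = list(zip(*rows))
    return [min(c) for c in columns], [max(c) for c in columns]


def min_max_scale(rows, mins, maxs):
    """Map each feature to [0, 1]; constant features are mapped to 0.0."""
    scaled = []
    for row in rows:
        scaled.append([
            (x - lo) / (hi - lo) if hi > lo else 0.0
            for x, lo, hi in zip(row, mins, maxs)
        ])
    return scaled


# Three frames with two features each.
frames = [[1.0, 10.0], [2.0, 30.0], [3.0, 20.0]]
mins, maxs = min_max_stats(frames)
scaled = min_max_scale(frames, mins, maxs)
print(mins, maxs)
print(scaled)
```

Keeping the minima and maxima separate from the scaled data is what allows the same transformation to be reapplied later to new frames.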

Examples

This is a use case of how to use the building block from Python:

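from biobb_pytorch.mdae.mdfeaturizer import mdfeaturizer

# Placeholder paths; replace them with your own files.
trajectory_file = "trajectory.xtc"
topology_file = "topology.pdb"
output_file = "dataset.pt"
output_stats_file = "stats.pt"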

prop = {
    'cartesian': {'selection': 'name CA'},
    'distances': {'selection': 'name CA',
                  'cutoff': 0.4,
                  'periodic': True,
                  'bonded': False},
    'angles': {'selection': 'backbone',
               'periodic': True,
               'bonded': True},
    'dihedrals': {'selection': 'backbone',
                  'periodic': True,
                  'bonded': True},
    'options': {'timelag': 10,
                'norm_in': {'mode': 'min_max'}
               }
}

mdfeaturizer(input_trajectory_path=trajectory_file,
             input_topology_path=topology_file,
             output_dataset_pt_path=output_file,
             output_stats_pt_path=output_stats_file,
             properties=prop)
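
The example sets `timelag` to 10. This page does not say how the option is applied; in time-lagged autoencoders it commonly means that each frame is paired with the frame a fixed number of steps later. A minimal sketch of that pairing, under this assumption and with a hypothetical helper name, is:

```python
def time_lagged_pairs(frames, timelag):
    """Pair frame t with frame t + timelag for every t where both exist."""
    if timelag < 1:
        raise ValueError("timelag must be a positive number of frames")
    return list(zip(frames[:-timelag], frames[timelag:]))


# Frame indices stand in for feature vectors here.
frames = list(range(6))
pairs = time_lagged_pairs(frames, timelag=2)
print(pairs)
```

A trajectory of N frames yields N minus `timelag` pairs in this scheme, so a lag longer than the trajectory produces no pairs at all.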

Info:

wrapped_software:
- name: PyTorch
- version: >=1.6.0
- license: BSD 3-Clause
ontology:
- name: EDAM
- schema: http://edamontology.org/EDAM.owl

featurize_trajectory() → None[source]

launch() → int[source]: Execute the MDFeaturePipeline object

topology_indices() → Dict[str, Any][source]

biobb_pytorch.mdae.mdfeaturizer.mdfeaturizer(input_topology_path: str, output_dataset_pt_path: str, output_stats_pt_path: str, properties: dict, input_trajectory_path: str | None = None, input_labels_npy_path: str | None = None, input_weights_npy_path: str | None = None, **kwargs) → int[source]

biobb_pytorch mdfeaturizer
Obtain the Molecular Dynamics Features for PyTorch model training.

Parameters:

input_trajectory_path (str) (Optional) –
Path to the input trajectory file (if omitted, the topology file is used as the trajectory). File type: input. Sample file. Accepted formats: xtc (edam:format_3875), dcd (edam:format_3878).
input_topology_path (str) –
Path to the input topology file. File type: input. Sample file. Accepted formats: pdb (edam:format_2333).
output_dataset_pt_path (str) –
Path to the output dataset model file. File type: output. Sample file. Accepted formats: pt (edam:format_2333).
output_stats_pt_path (str) –
Path to the output model statistics file. File type: output. Sample file. Accepted formats: pt (edam:format_2333).
properties (dict - Python dictionary object containing the tool parameters, not input/output files) –
- cartesian (dict) - ({“selection”: “name CA”}) Atom selection options for Cartesian coordinates feature generation (e.g. selection, fit_selection).
- distances (dict) - ({“selection”: “name CA”, “cutoff”: 0.4, “periodic”: True, “bonded”: False}) Atom selection options for pairwise distance features (selection, cutoff, periodic, bonded, etc.).
- angles (dict) - ({“selection”: “backbone”, “periodic”: True, “bonded”: True}) Atom selection options for angle features (selection, periodic, bonded, etc.).
- dihedrals (dict) - ({“selection”: “backbone”, “periodic”: True, “bonded”: True}) Atom selection options for dihedral features (selection, periodic, bonded, etc.).
- options (dict) - ({“norm_in”: {“mode”: “min_max”}}) General processing options (e.g. timelag, norm_in).

Examples

This is a use case of how to use the building block from Python:

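from biobb_pytorch.mdae.mdfeaturizer import mdfeaturizer

# Placeholder paths; replace them with your own files.
trajectory_file = "trajectory.xtc"
topology_file = "topology.pdb"
output_file = "dataset.pt"
output_stats_file = "stats.pt"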

prop = {
    'cartesian': {'selection': 'name CA'},
    'distances': {'selection': 'name CA',
                  'cutoff': 0.4,
                  'periodic': True,
                  'bonded': False},
    'angles': {'selection': 'backbone',
               'periodic': True,
               'bonded': True},
    'dihedrals': {'selection': 'backbone',
                  'periodic': True,
                  'bonded': True},
    'options': {'timelag': 10,
                'norm_in': {'mode': 'min_max'}
               }
}

mdfeaturizer(input_trajectory_path=trajectory_file,
             input_topology_path=topology_file,
             output_dataset_pt_path=output_file,
             output_stats_pt_path=output_stats_file,
             properties=prop)

Info:

wrapped_software:
- name: PyTorch
- version: >=1.6.0
- license: BSD 3-Clause
ontology:
- name: EDAM
- schema: http://edamontology.org/EDAM.owl

mdae.train_model module

class biobb_pytorch.mdae.train_model.TrainModel(input_model_pth_path: str, input_dataset_pt_path: str, output_model_pth_path: str | None = None, output_metrics_npz_path: str | None = None, properties: dict | None = None, **kwargs)[source]

Bases: BiobbObject

biobb_pytorch TrainModel
Trains a PyTorch autoencoder using the given properties.

Parameters:

input_model_pth_path (str) –
Path to the input model file. File type: input. Sample file. Accepted formats: pth (edam:format_2333).
input_dataset_pt_path (str) –
Path to the input dataset file (.pt) produced by the MD feature pipeline. File type: input. Sample file. Accepted formats: pt (edam:format_2333).
output_model_pth_path (str) (Optional) –
Path to save the trained model (.pth). If omitted, the trained model is only available in memory. File type: output. Sample file. Accepted formats: pth (edam:format_2333).
output_metrics_npz_path (str) (Optional) –
Path to save the training metrics in compressed NumPy format (.npz). File type: output. Sample file. Accepted formats: npz (edam:format_2333).
properties (dict - Python dictionary object containing the tool parameters, not input/output files) –
- Trainer (dict) - ({}) PyTorch Lightning Trainer options (e.g. max_epochs, callbacks, logger, profiler, accelerator, devices, etc.).
- Dataset (dict) - ({}) mlcolvar DictDataset / DictModule options (e.g. batch_size, split proportions and shuffling flags).
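
To make the Dataset options concrete, the sketch below works out how many samples end up in each split and how many batches one training epoch contains for a given `batch_size`. The helpers `split_sizes` and `batches_per_epoch` are hypothetical, and the rounding rule is an assumption; the underlying data module may round differently.

```python
import math


def split_sizes(n_samples, train_prop, val_prop):
    """Return (n_train, n_val); any rounding remainder goes to validation."""
    if abs(train_prop + val_prop - 1.0) > 1e-9:
        raise ValueError("train_prop and val_prop should sum to 1")
    n_train = round(n_samples * train_prop)
    return n_train, n_samples - n_train


def batches_per_epoch(n_items, batch_size):
    """Number of batches needed to visit every item once (last batch may be smaller)."""
    return math.ceil(n_items / batch_size)


n_train, n_val = split_sizes(1000, train_prop=0.8, val_prop=0.2)
n_batches = batches_per_epoch(n_train, batch_size=32)
print(n_train, n_val, n_batches)
```

With `max_epochs` set to 10, a training run of this size would take at most 10 times `n_batches` optimisation steps, fewer if a callback such as early stopping ends training sooner.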

Examples

This example shows how to use the TrainModel class to train a PyTorch autoencoder model:

from biobb_pytorch.mdae.train_model import train_model

input_model_pth_path='input_model.pth'
input_dataset_pt_path='input_dataset.pt'
output_model_pth_path='output_model.pth'
output_metrics_npz_path='output_metrics.npz'

prop = {
    'Trainer': {
        'max_epochs': 10,
        'callbacks': {
            'metrics': ['EarlyStopping']
        }
    },
    'Dataset': {
        'batch_size': 32,
        'split': {
            'train_prop': 0.8,
            'val_prop': 0.2
        }
    }
}

train_model(input_model_pth_path=input_model_pth_path,
            input_dataset_pt_path=input_dataset_pt_path,
            output_model_pth_path=output_model_pth_path,
            output_metrics_npz_path=output_metrics_npz_path,
            properties=prop)

Info:

wrapped_software:
- name: PyTorch
- version: >=1.6.0
- license: BSD 3-Clause
ontology:
- name: EDAM
- schema: http://edamontology.org/EDAM.owl

create_datamodule(dataset)[source]

fit_model(trainer, model, datamodule)[source]: Fit the model to the data, capturing logs and keeping tqdm clean.

get_callbacks()[source]

get_logger()[source]

get_profiler()[source]

get_trainer()[source]

launch() → int[source]: Execute the TrainModel object.

load_dataset()[source]

load_model()[source]

save_full(model) → None[source]: Serialize the full model object (including architecture).

biobb_pytorch.mdae.train_model.train_model(properties: dict, input_model_pth_path: str, input_dataset_pt_path: str, output_model_pth_path: str | None = None, output_metrics_npz_path: str | None = None, **kwargs) → int[source]

biobb_pytorch TrainModel
Trains a PyTorch autoencoder using the given properties.

Parameters:

input_model_pth_path (str) –
Path to the input model file. File type: input. Sample file. Accepted formats: pth (edam:format_2333).
input_dataset_pt_path (str) –
Path to the input dataset file (.pt) produced by the MD feature pipeline. File type: input. Sample file. Accepted formats: pt (edam:format_2333).
output_model_pth_path (str) (Optional) –
Path to save the trained model (.pth). If omitted, the trained model is only available in memory. File type: output. Sample file. Accepted formats: pth (edam:format_2333).
output_metrics_npz_path (str) (Optional) –
Path to save the training metrics in compressed NumPy format (.npz). File type: output. Sample file. Accepted formats: npz (edam:format_2333).
properties (dict - Python dictionary object containing the tool parameters, not input/output files) –
- Trainer (dict) - ({}) PyTorch Lightning Trainer options (e.g. max_epochs, callbacks, logger, profiler, accelerator, devices, etc.).
- Dataset (dict) - ({}) mlcolvar DictDataset / DictModule options (e.g. batch_size, split proportions and shuffling flags).

Examples

This example shows how to use the TrainModel class to train a PyTorch autoencoder model:

from biobb_pytorch.mdae.train_model import train_model

input_model_pth_path='input_model.pth'
input_dataset_pt_path='input_dataset.pt'
output_model_pth_path='output_model.pth'
output_metrics_npz_path='output_metrics.npz'

prop = {
    'Trainer': {
        'max_epochs': 10,
        'callbacks': {
            'metrics': ['EarlyStopping']
        }
    },
    'Dataset': {
        'batch_size': 32,
        'split': {
            'train_prop': 0.8,
            'val_prop': 0.2
        }
    }
}

train_model(input_model_pth_path=input_model_pth_path,
            input_dataset_pt_path=input_dataset_pt_path,
            output_model_pth_path=output_model_pth_path,
            output_metrics_npz_path=output_metrics_npz_path,
            properties=prop)

Info:

wrapped_software:
- name: PyTorch
- version: >=1.6.0
- license: BSD 3-Clause
ontology:
- name: EDAM
- schema: http://edamontology.org/EDAM.owl

mdae.evaluate_model module

class biobb_pytorch.mdae.evaluate_model.EvaluateModel(input_model_pth_path: str, input_dataset_pt_path: str, output_results_npz_path: str, properties: dict, **kwargs)[source]

Bases: BiobbObject

biobb_pytorch EvaluateModel
Evaluate a Molecular Dynamics AutoEncoder (MDAE) PyTorch model.
Evaluates a PyTorch autoencoder from the given properties.

Parameters:

input_model_pth_path (str) –
Path to the trained model file. File type: input. Sample file. Accepted formats: pth (edam:format_2333).
input_dataset_pt_path (str) –
Path to the input dataset file (.pt) to evaluate on. File type: input. Sample file. Accepted formats: pt (edam:format_2333).
output_results_npz_path (str) –
Path to the output evaluation results file (compressed NumPy archive). File type: output. Sample file. Accepted formats: npz (edam:format_2333).
properties (dict - Python dictionary object containing the tool parameters, not input/output files) –
- Dataset (dict) - ({}) mlcolvar DictDataset / DataLoader options (e.g. batch_size, shuffle).

Examples

This example shows how to use the EvaluateModel class to evaluate a PyTorch autoencoder model:

from biobb_pytorch.mdae.evaluate_model import evaluate_model

input_model_pth_path='input_model.pth'
input_dataset_pt_path='input_dataset.pt'
output_results_npz_path='output_results.npz'

prop={
    'Dataset': {
        'batch_size': 32
    }
}

evaluate_model(input_model_pth_path=input_model_pth_path,
               input_dataset_pt_path=input_dataset_pt_path,
               output_results_npz_path=output_results_npz_path,
               properties=prop)

Info:

wrapped_software:
- name: PyTorch
- version: >=1.6.0
- license: BSD 3-Clause
ontology:
- name: EDAM
- schema: http://edamontology.org/EDAM.owl

create_dataloader(dataset)[source]

evaluate_decoder(model, dataloader)[source]: Evaluate the decoder part of the model.

evaluate_encoder(model, dataloader)[source]: Evaluate the encoder part of the model.

evaluate_full_model(model, dataloader)[source]: Evaluate the model on the data, computing average loss and collecting output variables.
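
The pattern described for evaluate_full_model (loop over batches, accumulate a loss, collect the outputs) can be sketched in plain Python as below. This is an illustration only: `toy_model`, `batches`, and `evaluate` are hypothetical names, mean squared error is an assumed loss, and the real method works on PyTorch tensors with whatever loss the model was built with.

```python
def batches(items, batch_size):
    """Yield consecutive slices of at most batch_size items."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def evaluate(model, data, batch_size):
    """Return (mean squared reconstruction error per sample, all reconstructions)."""
    total_loss = 0.0
    outputs = []
    for batch in batches(data, batch_size):
        for x in batch:
            xhat = model(x)
            # Per-sample error, averaged over the features of that sample.
            total_loss += sum((a - b) ** 2 for a, b in zip(x, xhat)) / len(x)
            outputs.append(xhat)
    # Dividing by the number of samples keeps a short final batch from skewing the mean.
    return total_loss / len(data), outputs


def toy_model(x):
    """Hypothetical 'autoencoder' that reconstructs every feature shifted by 1.0."""
    return [v + 1.0 for v in x]


data = [[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]]
avg_loss, xhat = evaluate(toy_model, data, batch_size=2)
print(avg_loss, xhat)
```

Averaging over samples rather than over batches matters when the dataset size is not a multiple of `batch_size`, as with the three samples and batch size of two used here.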

launch() → int[source]: Execute the EvaluateModel object.

load_dataset()[source]

load_model()[source]

biobb_pytorch.mdae.evaluate_model.evaluate_model(properties: dict, input_model_pth_path: str, input_dataset_pt_path: str, output_results_npz_path: str, **kwargs) → int[source]

biobb_pytorch EvaluateModel
Evaluate a Molecular Dynamics AutoEncoder (MDAE) PyTorch model.
Evaluates a PyTorch autoencoder from the given properties.

Parameters:

input_model_pth_path (str) –
Path to the trained model file. File type: input. Sample file. Accepted formats: pth (edam:format_2333).
input_dataset_pt_path (str) –
Path to the input dataset file (.pt) to evaluate on. File type: input. Sample file. Accepted formats: pt (edam:format_2333).
output_results_npz_path (str) –
Path to the output evaluation results file (compressed NumPy archive). File type: output. Sample file. Accepted formats: npz (edam:format_2333).
properties (dict - Python dictionary object containing the tool parameters, not input/output files) –
- Dataset (dict) - ({}) mlcolvar DictDataset / DataLoader options (e.g. batch_size, shuffle).

Examples

This example shows how to use the EvaluateModel class to evaluate a PyTorch autoencoder model:

from biobb_pytorch.mdae.evaluate_model import evaluate_model

input_model_pth_path='input_model.pth'
input_dataset_pt_path='input_dataset.pt'
output_results_npz_path='output_results.npz'

prop={
    'Dataset': {
        'batch_size': 32
    }
}

evaluate_model(input_model_pth_path=input_model_pth_path,
               input_dataset_pt_path=input_dataset_pt_path,
               output_results_npz_path=output_results_npz_path,
               properties=prop)

Info:

wrapped_software:
- name: PyTorch
- version: >=1.6.0
- license: BSD 3-Clause
ontology:
- name: EDAM
- schema: http://edamontology.org/EDAM.owl

mdae.encode_model module

class biobb_pytorch.mdae.encode_model.EvaluateEncoder(input_model_pth_path: str, input_dataset_pt_path: str, output_results_npz_path: str, properties: dict, **kwargs)[source]

Bases: BiobbObject

biobb_pytorch EvaluateEncoder
Encode data with a Molecular Dynamics AutoEncoder (MDAE) model.
Evaluates a PyTorch autoencoder from the given properties.

Parameters:

input_model_pth_path (str) –
Path to the trained model file whose encoder will be used. File type: input. Sample file. Accepted formats: pth (edam:format_2333).
input_dataset_pt_path (str) –
Path to the input dataset file (.pt) to encode. File type: input. Sample file. Accepted formats: pt (edam:format_2333).
output_results_npz_path (str) –
Path to the output latent-space results file (compressed NumPy archive, typically containing ‘z’). File type: output. Sample file. Accepted formats: npz (edam:format_2333).
properties (dict - Python dictionary object containing the tool parameters, not input/output files) –
- Dataset (dict) - ({}) mlcolvar DictDataset / DataLoader options (e.g. batch_size, shuffle).

Examples

This example shows how to use the EvaluateEncoder class to evaluate a PyTorch autoencoder model:

from biobb_pytorch.mdae.encode_model import encode_model

input_model_pth_path='input_model.pth'
input_dataset_pt_path='input_dataset.pt'
output_results_npz_path='output_results.npz'

prop={
    'Dataset': {
        'batch_size': 32
    }
}

encode_model(input_model_pth_path=input_model_pth_path,
             input_dataset_pt_path=input_dataset_pt_path,
             output_results_npz_path=output_results_npz_path,
             properties=prop)

Info:

wrapped_software:
- name: PyTorch
- version: >=1.6.0
- license: BSD 3-Clause
ontology:
- name: EDAM
- schema: http://edamontology.org/EDAM.owl

create_dataloader(dataset)[source]

evaluate_encoder(model, dataloader)[source]: Evaluate the encoder part of the model.

launch() → int[source]: Execute the EvaluateEncoder object.

load_dataset()[source]

load_model()[source]

biobb_pytorch.mdae.encode_model.encode_model(properties: dict, input_model_pth_path: str, input_dataset_pt_path: str, output_results_npz_path: str, **kwargs) → int[source]

biobb_pytorch EvaluateEncoder
Encode data with a Molecular Dynamics AutoEncoder (MDAE) model.
Evaluates a PyTorch autoencoder from the given properties.

Parameters:

input_model_pth_path (str) –
Path to the trained model file whose encoder will be used. File type: input. Sample file. Accepted formats: pth (edam:format_2333).
input_dataset_pt_path (str) –
Path to the input dataset file (.pt) to encode. File type: input. Sample file. Accepted formats: pt (edam:format_2333).
output_results_npz_path (str) –
Path to the output latent-space results file (compressed NumPy archive, typically containing ‘z’). File type: output. Sample file. Accepted formats: npz (edam:format_2333).
properties (dict - Python dictionary object containing the tool parameters, not input/output files) –
- Dataset (dict) - ({}) mlcolvar DictDataset / DataLoader options (e.g. batch_size, shuffle).

Examples

This example shows how to use the EvaluateEncoder class to evaluate a PyTorch autoencoder model:

from biobb_pytorch.mdae.encode_model import encode_model

input_model_pth_path='input_model.pth'
input_dataset_pt_path='input_dataset.pt'
output_results_npz_path='output_results.npz'

prop={
    'Dataset': {
        'batch_size': 32
    }
}

encode_model(input_model_pth_path=input_model_pth_path,
             input_dataset_pt_path=input_dataset_pt_path,
             output_results_npz_path=output_results_npz_path,
             properties=prop)

Info:

wrapped_software:
- name: PyTorch
- version: >=1.6.0
- license: BSD 3-Clause
ontology:
- name: EDAM
- schema: http://edamontology.org/EDAM.owl

mdae.decode_model module

class biobb_pytorch.mdae.decode_model.EvaluateDecoder(input_model_pth_path: str, input_dataset_npy_path: str, output_results_npz_path: str, properties: dict, **kwargs)[source]

Bases: BiobbObject

biobb_pytorch decode_model
Decode latent variables with a Molecular Dynamics AutoEncoder (MDAE) model.
Evaluates the decoder of a PyTorch autoencoder from the given properties.

Parameters:

input_model_pth_path (str) –
Path to the trained model file whose decoder will be used. File type: input. Sample file. Accepted formats: pth (edam:format_2333).
input_dataset_npy_path (str) –
Path to the input latent variables file in NumPy format (e.g. encoded ‘z’). File type: input. Sample file. Accepted formats: npy (edam:format_2333).
output_results_npz_path (str) –
Path to the output reconstructed data file (compressed NumPy archive, typically containing ‘xhat’). File type: output. Sample file. Accepted formats: npz (edam:format_2333).
properties (dict - Python dictionary object containing the tool parameters, not input/output files) –
- Dataset (dict) - ({}) DataLoader options (e.g. batch_size, shuffle) for batching the latent variables.
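
When decoding, the order of the output rows is usually expected to match the order of the input latent vectors, which is why the `shuffle` option deserves attention. The sketch below shows how `batch_size` and `shuffle` affect iteration in a simple hand-written batcher. `iter_batches` is a hypothetical helper written for this illustration; it is not the DataLoader used by the library.

```python
import random


def iter_batches(rows, batch_size, shuffle=False, seed=0):
    """Yield lists of rows; optionally visit the rows in a shuffled order."""
    order = list(range(len(rows)))
    if shuffle:
        random.Random(seed).shuffle(order)
    for start in range(0, len(order), batch_size):
        yield [rows[i] for i in order[start:start + batch_size]]


latent = [[0.1], [0.2], [0.3], [0.4], [0.5]]

# Without shuffling, concatenating the batches reproduces the input order.
ordered = [row for batch in iter_batches(latent, batch_size=2) for row in batch]
batch_sizes = [len(batch) for batch in iter_batches(latent, batch_size=2)]

# With shuffling, the same rows are visited, but not necessarily in input order.
shuffled = [row for batch in iter_batches(latent, 2, shuffle=True) for row in batch]
print(batch_sizes, ordered == latent)
```

If the reconstructed rows need to be matched back to trajectory frames, leaving shuffling off is the safer choice in a setup like this one.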

Examples

This example shows how to use the EvaluateDecoder class to evaluate a PyTorch autoencoder model:

from biobb_pytorch.mdae.decode_model import decode_model

input_model_pth_path='input_model.pth'
input_dataset_npy_path='input_dataset.npy'
output_results_npz_path='output_results.npz'

prop={
    'Dataset': {
        'batch_size': 32
    }
}

decode_model(input_model_pth_path=input_model_pth_path,
             input_dataset_npy_path=input_dataset_npy_path,
             output_results_npz_path=output_results_npz_path,
             properties=prop)

Info:

wrapped_software:
- name: PyTorch
- version: >=1.6.0
- license: BSD 3-Clause
ontology:
- name: EDAM
- schema: http://edamontology.org/EDAM.owl

create_dataloader(dataset)[source]

evaluate_decoder(model, dataloader)[source]: Evaluate the decoder part of the model.

launch() → int[source]: Execute the EvaluateDecoder object.

load_dataset()[source]

load_model()[source]

biobb_pytorch.mdae.decode_model.decode_model(properties: dict, input_model_pth_path: str, input_dataset_npy_path: str, output_results_npz_path: str, **kwargs) → int[source]

biobb_pytorch decode_model
Decode latent variables with a Molecular Dynamics AutoEncoder (MDAE) model.
Evaluates the decoder of a PyTorch autoencoder from the given properties.

Parameters:

input_model_pth_path (str) –
Path to the trained model file whose decoder will be used. File type: input. Sample file. Accepted formats: pth (edam:format_2333).
input_dataset_npy_path (str) –
Path to the input latent variables file in NumPy format (e.g. encoded ‘z’). File type: input. Sample file. Accepted formats: npy (edam:format_2333).
output_results_npz_path (str) –
Path to the output reconstructed data file (compressed NumPy archive, typically containing ‘xhat’). File type: output. Sample file. Accepted formats: npz (edam:format_2333).
properties (dict - Python dictionary object containing the tool parameters, not input/output files) –
- Dataset (dict) - ({}) DataLoader options (e.g. batch_size, shuffle) for batching the latent variables.

Examples

This example shows how to use the EvaluateDecoder class to evaluate a PyTorch autoencoder model:

from biobb_pytorch.mdae.decode_model import decode_model

input_model_pth_path='input_model.pth'
input_dataset_npy_path='input_dataset.npy'
output_results_npz_path='output_results.npz'

prop={
    'Dataset': {
        'batch_size': 32
    }
}

decode_model(input_model_pth_path=input_model_pth_path,
             input_dataset_npy_path=input_dataset_npy_path,
             output_results_npz_path=output_results_npz_path,
             properties=prop)

Info:

wrapped_software:
- name: PyTorch
- version: >=1.6.0
- license: BSD 3-Clause
ontology:
- name: EDAM
- schema: http://edamontology.org/EDAM.owl

mdae.make_plumed module

class biobb_pytorch.mdae.make_plumed.GeneratePlumed(input_model_pth_path: str, input_stats_pt_path: str | None = None, input_reference_pdb_path: str | None = None, input_ndx_path: str | None = None, output_plumed_dat_path: str = 'plumed.dat', output_features_dat_path: str = 'features.dat', output_model_ptc_path: str = 'model.ptc', properties: Dict[str, Any] | None = None, **kwargs)[source]

Bases: BiobbObject

biobb_pytorch GeneratePlumed
Generate PLUMED input for biased dynamics using an MDAE model.
Generates a PLUMED input file, features.dat, and converts the model to .ptc format.

Parameters:

input_model_pth_path (str) – Path to the trained PyTorch model (.pth) to be converted to TorchScript and used in PLUMED. File type: input. Accepted formats: pth (edam:format_2333).
input_stats_pt_path (str) (Optional) – Path to statistics file (.pt) produced during featurization, used to derive the PLUMED features.dat content. File type: input. Accepted formats: pt (edam:format_2333).
input_reference_pdb_path (str) (Optional) – Path to reference PDB used for FIT_TO_TEMPLATE actions when Cartesian features are present. File type: input. Accepted formats: pdb (edam:format_1476).
input_ndx_path (str) (Optional) – Path to GROMACS index (NDX) file used to define groups when required by PLUMED. File type: input. Accepted formats: ndx (edam:format_2033).
output_plumed_dat_path (str) – Path to the output PLUMED input file. File type: output. Accepted formats: dat (edam:format_2330).
output_features_dat_path (str) – Path to the output features.dat file describing the CVs to PLUMED. File type: output. Accepted formats: dat (edam:format_2330).
output_model_ptc_path (str) – Path to the output TorchScript model file (.ptc) for PLUMED’s PYTORCH_MODEL action. File type: output. Accepted formats: ptc (edam:format_2333).
properties (dict - Python dictionary object containing the tool parameters, not input/output files) –
- include_energy (bool) - (True) Whether to include ENERGY in PLUMED.
- bias (list) - ([]) List of biasing actions (e.g. METAD) to be added to the PLUMED file.
- prints (dict) - ({“ARG”: “*”, “STRIDE”: 1, “FILE”: “COLVAR”}) PRINT command parameters (e.g. ARG, STRIDE, FILE).
- group (dict) - (None) GROUP definition options (label, NDX group or atom selection parameters).
- wholemolecules (dict) - (None) WHOLEMOLECULES options when using Cartesian coordinates.
- fit_to_template (dict) - (None) FIT_TO_TEMPLATE options (e.g. STRIDE, TYPE, etc.).
- pytorch_model (dict) - (None) PYTORCH_MODEL options (label, PACE and other parameters).

Examples

This example shows how to use the GeneratePlumed class to generate a PLUMED input file for biased dynamics using an MDAE model:

from biobb_pytorch.mdae.make_plumed import make_plumed

prop = {
    "additional_actions": [
        {
            "name": "ENERGY",
            "label": "ene"
        },
        {
            "name": "RMSD",
            "label": "rmsd",
            "params": {
                "TYPE": "OPTIMAL"
            }
        }
    ],
    "group": {
        "label": "c_alphas",
        "NDX_GROUP": "chA_&_C-alpha"
    },
    "wholemolecules": {
        "ENTITY0": "c_alphas"
    },
    "fit_to_template": {
        "STRIDE": 1,
        "TYPE": "OPTIMAL"
    },
    "pytorch_model": {
        "label": "cv",
        "PACE": 1
    },
    "bias": [
        {
            "name": "METAD",
            "label": "bias",
            "params": {
                "ARG": "cv.1",
                "PACE": 500,
                "HEIGHT": 1.2,
                "SIGMA": 0.35,
                "FILE": "HILLS",
                "BIASFACTOR": 8
            }
        }
    ],
    "prints": {
        "ARG": "cv.*,bias.*",
        "STRIDE": 1,
        "FILE": "COLVAR"
    }
}

make_plumed(
    input_model_pth_path="model.pth",
    input_stats_pt_path="stats.pt",
    output_plumed_dat_path="plumed.dat",
    output_features_dat_path="features.dat",
    output_model_ptc_path="model.ptc",
    properties=prop
)
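
Each entry in `bias` or `additional_actions` above has a name, an optional label, and optional parameters, which maps naturally onto PLUMED's `label: ACTION KEY=VALUE ...` input syntax. The sketch below shows one way such a dictionary could be turned into a PLUMED line. `render_action` is a hypothetical helper; how GeneratePlumed formats its output internally is not described on this page.

```python
def render_action(action):
    """Format {'name', 'label', 'params'} as a PLUMED line: 'label: NAME KEY=VALUE ...'."""
    head = action["name"]
    if action.get("label"):
        head = f"{action['label']}: {head}"
    params = " ".join(
        f"{key}={value}" for key, value in action.get("params", {}).items()
    )
    # rstrip removes the trailing space left when there are no parameters.
    return f"{head} {params}".rstrip()


metad = {
    "name": "METAD",
    "label": "bias",
    "params": {
        "ARG": "cv.1",
        "PACE": 500,
        "HEIGHT": 1.2,
        "SIGMA": 0.35,
        "FILE": "HILLS",
        "BIASFACTOR": 8,
    },
}
line = render_action(metad)
energy_line = render_action({"name": "ENERGY", "label": "ene"})
print(line)
print(energy_line)
```

The label given to an action is what later lines refer to, which is why the `prints` entry in the example can use `bias.*` to select the outputs of the METAD action.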

Info:

wrapped_software:
- name: PLUMED with PyTorch
- version: >=2.0
- license: LGPL 3.0
ontology:
- name: EDAM
- schema: http://edamontology.org/EDAM.owl

launch() → int[source]: Execute the GeneratePlumed object.

biobb_pytorch.mdae.make_plumed.make_plumed(input_model_pth_path: str, input_stats_pt_path: str | None = None, input_reference_pdb_path: str | None = None, input_ndx_path: str | None = None, output_plumed_dat_path: str = 'plumed.dat', output_features_dat_path: str = 'features.dat', output_model_ptc_path: str = 'model.ptc', properties: Dict[str, Any] | None = None, **kwargs) → int[source]

biobb_pytorch GeneratePlumed
Generate PLUMED input for biased dynamics using an MDAE model.
Generates a PLUMED input file, features.dat, and converts the model to .ptc format.

Parameters:

input_model_pth_path (str) – Path to the trained PyTorch model (.pth) to be converted to TorchScript and used in PLUMED. File type: input. Accepted formats: pth (edam:format_2333).
input_stats_pt_path (str) (Optional) – Path to statistics file (.pt) produced during featurization, used to derive the PLUMED features.dat content. File type: input. Accepted formats: pt (edam:format_2333).
input_reference_pdb_path (str) (Optional) – Path to reference PDB used for FIT_TO_TEMPLATE actions when Cartesian features are present. File type: input. Accepted formats: pdb (edam:format_1476).
input_ndx_path (str) (Optional) – Path to GROMACS index (NDX) file used to define groups when required by PLUMED. File type: input. Accepted formats: ndx (edam:format_2033).
output_plumed_dat_path (str) – Path to the output PLUMED input file. File type: output. Accepted formats: dat (edam:format_2330).
output_features_dat_path (str) – Path to the output features.dat file describing the CVs to PLUMED. File type: output. Accepted formats: dat (edam:format_2330).
output_model_ptc_path (str) – Path to the output TorchScript model file (.ptc) for PLUMED’s PYTORCH_MODEL action. File type: output. Accepted formats: ptc (edam:format_2333).
properties (dict - Python dictionary object containing the tool parameters, not input/output files) –
- include_energy (bool) - (True) Whether to include ENERGY in PLUMED.
- bias (list) - ([]) List of biasing actions (e.g. METAD) to be added to the PLUMED file.
- prints (dict) - ({“ARG”: “*”, “STRIDE”: 1, “FILE”: “COLVAR”}) PRINT command parameters (e.g. ARG, STRIDE, FILE).
- group (dict) - (None) GROUP definition options (label, NDX group or atom selection parameters).
- wholemolecules (dict) - (None) WHOLEMOLECULES options when using Cartesian coordinates.
- fit_to_template (dict) - (None) FIT_TO_TEMPLATE options (e.g. STRIDE, TYPE, etc.).
- pytorch_model (dict) - (None) PYTORCH_MODEL options (label, PACE and other parameters).

Examples

This example shows how to use the GeneratePlumed class to generate a PLUMED input file for biased dynamics using an MDAE model:

from biobb_pytorch.mdae.make_plumed import make_plumed

prop = {
    "additional_actions": [
        {
            "name": "ENERGY",
            "label": "ene"
        },
        {
            "name": "RMSD",
            "label": "rmsd",
            "params": {
                "TYPE": "OPTIMAL"
            }
        }
    ],
    "group": {
        "label": "c_alphas",
        "NDX_GROUP": "chA_&_C-alpha"
    },
    "wholemolecules": {
        "ENTITY0": "c_alphas"
    },
    "fit_to_template": {
        "STRIDE": 1,
        "TYPE": "OPTIMAL"
    },
    "pytorch_model": {
        "label": "cv",
        "PACE": 1
    },
    "bias": [
        {
            "name": "METAD",
            "label": "bias",
            "params": {
                "ARG": "cv.1",
                "PACE": 500,
                "HEIGHT": 1.2,
                "SIGMA": 0.35,
                "FILE": "HILLS",
                "BIASFACTOR": 8
            }
        }
    ],
    "prints": {
        "ARG": "cv.*,bias.*",
        "STRIDE": 1,
        "FILE": "COLVAR"
    }
}

make_plumed(
    input_model_pth_path="model.pth",
    input_stats_pt_path="stats.pt",
    output_plumed_dat_path="plumed.dat",
    output_features_dat_path="features.dat",
    output_model_ptc_path="model.ptc",
    properties=prop
)
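
As a rough illustration of how an entry in "bias" or "additional_actions" relates to PLUMED input syntax, the sketch below renders one action dictionary as a line of the form label: ACTION KEY=VALUE. The helper render_plumed_action is hypothetical and is not part of biobb_pytorch; the line actually written by make_plumed may differ in ordering or formatting.

```python
def render_plumed_action(action):
    """Render one action dict as a single PLUMED input line (illustrative only)."""
    # PLUMED lines have the general form "label: ACTION KEY=VALUE ...".
    params = " ".join(
        f"{key}={value}" for key, value in action.get("params", {}).items()
    )
    line = f"{action['label']}: {action['name']}"
    return f"{line} {params}" if params else line


# Same METAD entry as in the example above.
metad = {
    "name": "METAD",
    "label": "bias",
    "params": {
        "ARG": "cv.1",
        "PACE": 500,
        "HEIGHT": 1.2,
        "SIGMA": 0.35,
        "FILE": "HILLS",
        "BIASFACTOR": 8,
    },
}

print(render_plumed_action(metad))
print(render_plumed_action({"name": "ENERGY", "label": "ene"}))
```

Reading the properties this way can help when checking the generated plumed.dat against what was requested.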

Info:

wrapped_software:
- name: PLUMED with PyTorch
- version: >=2.0
- license: LGPL 3.0
ontology:
- name: EDAM
- schema: http://edamontology.org/EDAM.owl

mdae.feat2traj module

class biobb_pytorch.mdae.feat2traj.Feat2Traj(input_results_npz_path: str, input_stats_pt_path: str, input_topology_path: str | None = None, output_traj_path: str | None = None, output_top_path: str | None = None, properties: dict | None = None, **kwargs)[source]

Bases: BiobbObject

biobb_pytorch Feat2Traj
Converts reconstructed features from a results file (.npz) to a trajectory, using the cartesian indices and topology stored in the stats file (.pt).

Parameters:

input_results_npz_path (str) –
Path to the input reconstructed results file (.npz), typically containing an ‘xhat’ array. File type: input. Sample file. Accepted formats: npz (edam:format_2333).
input_stats_pt_path (str) –
Path to the input model statistics file (.pt) containing cartesian indices and optionally topology. File type: input. Sample file. Accepted formats: pt (edam:format_2333).
input_topology_path (str) (optional) –
Path to the topology file (PDB), used only if no suitable topology is found in the stats file. File type: input. Sample file. Accepted formats: pdb (edam:format_1476).
output_traj_path (str) –
Path to save the trajectory in xtc/pdb/dcd format. File type: output. Sample file. Accepted formats: xtc (edam:format_3875), pdb (edam:format_1476), dcd (edam:format_3878).
output_top_path (str) (optional) –
Path to save the output topology file (pdb). Used if trajectory format requires separate topology. File type: output. Sample file. Accepted formats: pdb (edam:format_1476).
properties (dict - Python dictionary object containing the tool parameters, not input/output files) –
- restart (bool) - (False) [WF property] Do not execute if output files exist.

Examples

This example shows how to use the Feat2Traj class to convert reconstructed features (.npz) to a trajectory using the cartesian indices and topology from the stats file:

from biobb_pytorch.mdae.feat2traj import feat2traj

input_results_npz_path = 'input_results.npz'
input_stats_pt_path = 'input_model.pt'
input_topology_path = 'input_topology.pdb'
output_traj_path = 'output_model.xtc'
output_top_path = 'output_model.pdb'

prop = {}

feat2traj(input_results_npz_path=input_results_npz_path,
          input_stats_pt_path=input_stats_pt_path,
          input_topology_path=input_topology_path,
          output_traj_path=output_traj_path,
          output_top_path=output_top_path,
          properties=prop)

Info:

wrapped_software:
- name: PyTorch
- version: >=1.6.0
- license: BSD 3-Clause
ontology:
- name: EDAM
- schema: http://edamontology.org/EDAM.owl

launch() → int[source]: Execute the Feat2Traj class and its .launch() method.

biobb_pytorch.mdae.feat2traj.feat2traj(input_results_npz_path: str, input_stats_pt_path: str, input_topology_path: str | None = None, output_traj_path: str | None = None, output_top_path: str | None = None, properties: dict | None = None, **kwargs) → int[source]

biobb_pytorch Feat2Traj
Converts reconstructed features from a results file (.npz) to a trajectory, using the cartesian indices and topology stored in the stats file (.pt).

Parameters:

input_results_npz_path (str) –
Path to the input reconstructed results file (.npz), typically containing an ‘xhat’ array. File type: input. Sample file. Accepted formats: npz (edam:format_2333).
input_stats_pt_path (str) –
Path to the input model statistics file (.pt) containing cartesian indices and optionally topology. File type: input. Sample file. Accepted formats: pt (edam:format_2333).
input_topology_path (str) (optional) –
Path to the topology file (PDB), used only if no suitable topology is found in the stats file. File type: input. Sample file. Accepted formats: pdb (edam:format_1476).
output_traj_path (str) –
Path to save the trajectory in xtc/pdb/dcd format. File type: output. Sample file. Accepted formats: xtc (edam:format_3875), pdb (edam:format_1476), dcd (edam:format_3878).
output_top_path (str) (optional) –
Path to save the output topology file (pdb). Used if trajectory format requires separate topology. File type: output. Sample file. Accepted formats: pdb (edam:format_1476).
properties (dict - Python dictionary object containing the tool parameters, not input/output files) –
- restart (bool) - (False) [WF property] Do not execute if output files exist.

Examples

This example shows how to use the Feat2Traj class to convert reconstructed features (.npz) to a trajectory using the cartesian indices and topology from the stats file:

from biobb_pytorch.mdae.feat2traj import feat2traj

input_results_npz_path = 'input_results.npz'
input_stats_pt_path = 'input_model.pt'
input_topology_path = 'input_topology.pdb'
output_traj_path = 'output_model.xtc'
output_top_path = 'output_model.pdb'

prop = {}

feat2traj(input_results_npz_path=input_results_npz_path,
          input_stats_pt_path=input_stats_pt_path,
          input_topology_path=input_topology_path,
          output_traj_path=output_traj_path,
          output_top_path=output_top_path,
          properties=prop)
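
The output_top_path parameter is described as being used when the trajectory format requires a separate topology. The sketch below shows one way a caller might decide this from the file extension before calling feat2traj. The helper needs_separate_topology is hypothetical and not part of biobb_pytorch; it relies only on the fact that XTC and DCD files store coordinates without atom records, while PDB files carry both.

```python
from pathlib import Path

# Formats listed as accepted for output_traj_path.
COORDINATE_ONLY_FORMATS = {".xtc", ".dcd"}  # coordinates only, no atom records
SELF_DESCRIBING_FORMATS = {".pdb"}          # coordinates plus atom/residue records


def needs_separate_topology(traj_path):
    """Return True if the trajectory format cannot be read without a topology file."""
    suffix = Path(traj_path).suffix.lower()
    if suffix in COORDINATE_ONLY_FORMATS:
        return True
    if suffix in SELF_DESCRIBING_FORMATS:
        return False
    raise ValueError(f"Unsupported trajectory format: {suffix!r}")


for path in ("output_model.xtc", "output_model.dcd", "output_model.pdb"):
    print(path, needs_separate_topology(path))
```

When the helper returns True, passing output_top_path as well keeps the written trajectory loadable in standard viewers.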

Info:

wrapped_software:
- name: PyTorch
- version: >=1.6.0
- license: BSD 3-Clause
ontology:
- name: EDAM
- schema: http://edamontology.org/EDAM.owl

mdae.explainability.LRP module

class biobb_pytorch.mdae.explainability.LRP.LRP(input_model_pth_path: str, input_dataset_pt_path: str, output_results_npz_path: str | None = None, properties: dict | None = None, **kwargs)[source]

Bases: BiobbObject

biobb_pytorch LRP
Performs Layer-wise Relevance Propagation on a trained autoencoder encoder.

Parameters:

input_model_pth_path (str) –
Path to the trained model file whose encoder is analyzed. File type: input. Sample file. Accepted formats: pth (edam:format_2333).
input_dataset_pt_path (str) –
Path to the input dataset file (.pt) used for computing relevance scores. File type: input. Sample file. Accepted formats: pt (edam:format_2333).
output_results_npz_path (str) (Optional) –
Path to the output results file containing relevance scores (compressed NumPy archive). File type: output. Sample file. Accepted formats: npz (edam:format_2333).
properties (dict - Python dictionary object containing the tool parameters, not input/output files) –
- Dataset (dict) - ({}) Dataset/DataLoader options (e.g. batch_size and optional indices to subset the dataset).

Examples

This example shows how to use the LRP class to perform Layer-wise Relevance Propagation:

from biobb_pytorch.mdae.explainability.LRP import relevance_propagation

input_model_pth_path = 'input_model.pth'
input_dataset_pt_path = 'input_dataset.pt'
output_results_npz_path = 'output_results.npz'

prop = {
    'Dataset': {
        'batch_size': 32
    }
}

relevance_propagation(input_model_pth_path=input_model_pth_path,
                      input_dataset_pt_path=input_dataset_pt_path,
                      output_results_npz_path=output_results_npz_path,
                      properties=prop)

Info:

wrapped_software:
- name: PyTorch
- version: >=1.6.0
- license: BSD 3-Clause
ontology:
- name: EDAM
- schema: http://edamontology.org/EDAM.owl

compute_global_importance(model, dataloader, latent_index=None)[source]

create_dataloader(dataset)[source]

launch() → int[source]: Execute the LRP class and its .launch() method.

load_dataset()[source]

load_model()[source]

mask_idx(dataset: dict, indices: ndarray) → dict[source]: Mask the dataset (dict) for all keys.
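
A minimal sketch of what masking a dict-style dataset "for all keys" can look like, assuming every value is indexable along its first axis. Plain lists stand in for tensors here; the function name mask_by_indices and the keys "data" and "labels" are hypothetical, and the actual mask_idx implementation may differ.

```python
def mask_by_indices(dataset, indices):
    """Keep only the selected rows for every key of a dict-style dataset."""
    # The same index list is applied to each entry so the rows stay aligned.
    return {key: [values[i] for i in indices] for key, values in dataset.items()}


dataset = {
    "data": [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8]],
    "labels": [0, 1, 0, 1],
}

subset = mask_by_indices(dataset, [0, 2])
print(subset)
```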

biobb_pytorch.mdae.explainability.LRP.relevance_propagation(properties: dict, input_model_pth_path: str, input_dataset_pt_path: str, output_results_npz_path: str | None = None, **kwargs) → int[source]

biobb_pytorch LRP
Performs Layer-wise Relevance Propagation on a trained autoencoder encoder.

Parameters:

input_model_pth_path (str) –
Path to the trained model file whose encoder is analyzed. File type: input. Sample file. Accepted formats: pth (edam:format_2333).
input_dataset_pt_path (str) –
Path to the input dataset file (.pt) used for computing relevance scores. File type: input. Sample file. Accepted formats: pt (edam:format_2333).
output_results_npz_path (str) (Optional) –
Path to the output results file containing relevance scores (compressed NumPy archive). File type: output. Sample file. Accepted formats: npz (edam:format_2333).
properties (dict - Python dictionary object containing the tool parameters, not input/output files) –
- Dataset (dict) - ({}) Dataset/DataLoader options (e.g. batch_size and optional indices to subset the dataset).

Examples

This example shows how to use the LRP class to perform Layer-wise Relevance Propagation:

from biobb_pytorch.mdae.explainability.LRP import relevance_propagation

input_model_pth_path = 'input_model.pth'
input_dataset_pt_path = 'input_dataset.pt'
output_results_npz_path = 'output_results.npz'

prop = {
    'Dataset': {
        'batch_size': 32
    }
}

relevance_propagation(input_model_pth_path=input_model_pth_path,
                      input_dataset_pt_path=input_dataset_pt_path,
                      output_results_npz_path=output_results_npz_path,
                      properties=prop)
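
For readers unfamiliar with Layer-wise Relevance Propagation, the sketch below applies the commonly used epsilon rule to a single bias-free linear layer in plain Python. It is a conceptual illustration only: this documentation does not state which propagation rule the LRP class uses, and the function lrp_epsilon_linear and the toy weights are assumptions made for the example.

```python
def lrp_epsilon_linear(inputs, weights, relevance_out, eps=1e-9):
    """Propagate relevance through one bias-free linear layer z_j = sum_i a_i * w_ij.

    inputs:        list of n_in activations a_i
    weights:       n_in x n_out nested list w_ij
    relevance_out: list of n_out relevance scores R_j
    """
    n_in, n_out = len(inputs), len(relevance_out)
    # Forward pass: pre-activations of the layer.
    z = [sum(inputs[i] * weights[i][j] for i in range(n_in)) for j in range(n_out)]
    # Stabilise the denominator away from zero while keeping its sign.
    denom = [zj + (eps if zj >= 0 else -eps) for zj in z]
    # Each input receives a share of R_j proportional to its contribution a_i * w_ij.
    return [
        sum(inputs[i] * weights[i][j] / denom[j] * relevance_out[j] for j in range(n_out))
        for i in range(n_in)
    ]


activations = [1.0, 2.0, -1.0]
weights = [[0.5, -1.0], [1.0, 0.5], [0.25, 2.0]]
relevance_out = [1.0, 0.5]

relevance_in = lrp_epsilon_linear(activations, weights, relevance_out)
print(relevance_in)
print(sum(relevance_in), sum(relevance_out))
```

With a small eps and no bias term, the total relevance is approximately conserved from output to input, which is why the per-feature scores can be read as shares of the latent-variable value.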

Info:

wrapped_software:
- name: PyTorch
- version: >=1.6.0
- license: BSD 3-Clause
ontology:
- name: EDAM
- schema: http://edamontology.org/EDAM.owl