# BPNeural calculator

BPNeural is a Python module, accelerated with Fortran routines, based on the method proposed by Behler and Parrinello. In the Behler-Parrinello method, "symmetry functions" (invariant representations of the chemical environment of each atom in the system, described below) are taken as the inputs to a per-element neural network.
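To illustrate what a radial symmetry function looks like, here is a minimal sketch of Behler's G2 function under one common convention (a cosine cutoff, and `eta` scaled by the squared cutoff radius, matching the style of the `Gs` examples below). This is an illustration, not the module's internal implementation:

```python
import math


def cutoff(r, rc=6.5):
    # Cosine cutoff function: decays smoothly to zero at the cutoff radius rc,
    # so atoms beyond rc do not contribute to the fingerprint.
    if r > rc:
        return 0.0
    return 0.5 * (math.cos(math.pi * r / rc) + 1.0)


def g2(distances, eta=0.0009, rc=6.5):
    # G2 radial symmetry function: a sum of Gaussians over all neighbor
    # distances, each damped by the cutoff function.
    return sum(math.exp(-eta * (r / rc) ** 2) * cutoff(r, rc)
               for r in distances)
```

The value of `g2` is invariant to permuting, rotating, or translating the neighbors, which is what makes it a suitable per-atom input to the network.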


## Brief introduction

### Working with the calculator

You first import and train the calculator, which can be as simple as

```python
from neural.bp import BPNeural

calc = BPNeural(label='mycalc')
calc.train(images='path/to/trainingdata.traj')
```

In this example, the calculator is trained on the set of images contained in the file 'path/to/trainingdata.traj'. These images would have been created by a full-accuracy calculation, e.g., by a molecular dynamics simulation with the GPAW calculator.

The trained calculator can now be used just like any other ASE calculator, e.g.,

```python
atoms = ...
atoms.set_calculator(calc)
energy = atoms.get_potential_energy()
```

The calculator also saves the trained parameters (by default to 'trained-parameters.json'). You can re-load the calculator in a future script with

```python
calc = BPNeural(load='mycalc')
```

### Detailed parameters


The calculator is initiated with the following arguments, shown with their default values:

```python
from neural.bp import BPNeural

calc = BPNeural(load=None, label=None, dblabel=None, cutoff=6.5, Gs=None,
                hiddenlayers=None, activation='tanh', weights=None,
                scalings=None, fingerprints_range=None, extrapolate=True,
                fortran=True)
```

Here is a detailed list of all keywords for the BPNeural calculator:

| Keyword | Type | Default value | Description |
|---|---|---|---|
| `load` | str | None | Load an existing (trained) BPNeural calculator from this path. |
| `label` | str | None | Name used for all output files. |
| `dblabel` | str | None | Name used for all database output files. If not specified, takes the value of `label`. |
| `cutoff` | float | 6.5 | Radius above which neighbor interactions are ignored. |
| `Gs` | dict | None | Dictionary of chemical symbols and lists of dictionaries for making symmetry functions. Should be given in the form `Gs = {"O": [..., ...], "Au": [..., ...]}`, where each `...` is, for example, `{"type": "G2", "element": "O", "eta": 0.0009}` or `{"type": "G4", "elements": ["O", "Au"], "eta": 0.0001, "gamma": 0.1, "zeta": 1.0}`. If not given, it will be generated automatically. |
| `hiddenlayers` | dict | None | Dictionary of chemical element symbols and the architectures of their corresponding hidden layers. Note that for each atom, the number of nodes in the input layer is always equal to the number of symmetry functions (`Gs`), and the number of nodes in the output layer is always one. For example, `hiddenlayers = {"O": (3, 5), "Au": (5, 6)}` means that for "O" there are two hidden layers, the first with three nodes and the second with five nodes. |
| `activation` | str | 'tanh' | Type of activation function: "tanh" refers to the tanh function, and "sigmoid" refers to the sigmoid function. |
| `weights` | dict | None | Dictionary of symbols and dictionaries of arrays of weights; each symbol and its dictionary correspond to one chemical element and one neural network. The weights dictionary for the above example is `weights = {"O": {1: array(148, 3), 2: array(4, 5), 3: array(6, 1)}, "Au": {1: array(148, 5), 2: array(6, 6), 3: array(7, 1)}}`. The arrays connect node i in the previous layer with node j in the current layer via indices `w[i, j]`. There are n+1 rows (i values), with the last row being the bias, and m columns (j values). If `weights` is not given, the arrays are randomly generated such that the arguments of the activation functions fall in the range -1.5 to 1.5. |
| `scalings` | dict | None | Dictionary of slope and intercept parameters for each element. These are used to move the final value out of the range the activation function would otherwise lock it into. For example, `scalings = {"Pd": {"slope": 2, "intercept": 1}, "O": {"slope": 3, "intercept": 4}}`. |
| `fingerprints_range` | dict | None | Range of fingerprints of each chemical species. Should be given as a dictionary of chemical species and lists of minimum and maximum values, e.g. `fingerprints_range = {"Pd": [0.31, 0.59], "O": [0.56, 0.72]}`. |
| `extrapolate` | bool | True | If True, allows extrapolation; if False, does not. |
| `fortran` | bool | True | If True, uses the Fortran subroutines; otherwise does not. |
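For concreteness, here is a hypothetical helper (not part of the module) that assembles a `Gs` dictionary in the layout described above, with G2 terms for each neighbor element and G4 terms for each element pair. The parameter values are placeholders chosen for illustration:

```python
def make_gs(elements, etas=(0.05, 4.0), g4_eta=0.005, gamma=1.0, zeta=1.0):
    # Build a Gs dictionary in the form {symbol: [fingerprint dicts, ...]}.
    gs = {}
    for element in elements:
        fps = []
        # One G2 (radial) descriptor per neighbor element and eta value.
        for eta in etas:
            for other in elements:
                fps.append({'type': 'G2', 'element': other, 'eta': eta})
        # One G4 (angular) descriptor per unordered pair of elements.
        for i, e1 in enumerate(elements):
            for e2 in elements[i:]:
                fps.append({'type': 'G4', 'elements': [e1, e2],
                            'eta': g4_eta, 'gamma': gamma, 'zeta': zeta})
        gs[element] = fps
    return gs
```

The resulting dictionary can be passed directly as the `Gs` keyword; the length of each element's list fixes the number of input nodes for that element's network.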

### Starting from old parameters

The parameters of the calculator are saved after training so that the calculator can be re-established for future calculations. If the previously trained parameter file is named 'old.json', it can be loaded into BPNeural with something like this:

```python
calc = BPNeural(load='old.json', label='new')
```

The label ('new') is used as a prefix for any output from use of this calculator. In general, the calculator is written to not overwrite your old settings, but it is a good idea to have a different name for this label to avoid accidentally overwriting your carefully trained calculator's files!

## Training the calculator

Training a new calculator instance occurs with the train method, which is given a list of training images and a desired value of root mean square error (rmse) per image per atom as below:

```python
calc.train(images, energy_goal=0.001, force_goal=0.005)
```

Calling this method will generate output files where you can watch the progress. Note that this is in general a computationally-intensive process!
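The convergence criterion can be understood as follows: training stops once the root-mean-square error of the per-atom energies drops below `energy_goal`. A self-contained sketch of that measure (an assumed illustration, not taken from the module's source):

```python
import math


def energy_rmse_per_atom(predicted, actual, natoms):
    # Root-mean-square error of energy per atom over an image set.
    # predicted, actual: total energies (eV) for each image;
    # natoms: number of atoms in each image.
    errors = [(p - a) / n for p, a, n in zip(predicted, actual, natoms)]
    return math.sqrt(sum(e * e for e in errors) / len(errors))
```

Training would then be considered energy-converged when this value falls below, e.g., the default 0.001 eV.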

The two parameters are described in more detail in the following table:

| Keyword | Type | Default value | Description |
|---|---|---|---|
| `images` | list | – | List of ASE atoms objects with positions, symbols, and energies in ASE form; this is the training set. Energies can be obtained from any reference, e.g. DFT calculations. This can be provided in three forms: a list (or list-like object) of ASE images, a path to an ASE trajectory file, or a path to an ASE db file. |
| `energy_goal` | float | 0.001 | Threshold rms error of energy per atom at which training is converged, in units of eV. |
| `force_goal` | float | 0.005 | Threshold rmse per atom for forces at which training is converged, in units of eV/Ang. If `force_goal=None`, forces are not trained. |
| `train_forces` | bool | True | If True, forces are trained as well. |
| `overfitting_constraint` | float | 0. | Constant to suppress overfitting. A proper value for this constant is subtle and depends on the training data set; a small value may not cure over-fitting, whereas a large value may cause over-smoothing. |
| `force_coefficient` | float | None | Multiplier of the force square error in the cost function. This controls how significant the force fit is compared to the energy fit in approaching convergence. It depends on force and energy units. If not specified, a value is guessed such that energy and force contribute equally to the cost function when they hit their converged values. |
| `cores` | int | None | Number of cores to parallelize over. If not specified, attempts to determine this from the environment. |
| `optimizer` | function | scipy.optimize.fmin_bfgs | The optimization object. The default is scipy's fmin_bfgs, but any optimizer that behaves in the same way will do. |
| `read_fingerprints` | bool | True | Whether the code should read fingerprints already calculated and saved in the script directory. |
| `overwrite` | bool | False | If a trained output file with the same name exists, overwrite it. |

In general, we plan to make the ASE database the recommended data type. However, this is a 'work in progress' over at ASE.

## Growing your training set

Say you have a nicely trained calculator that you built around a training set of 10,000 images, but you decide you want to add another 1,000 images to the training set. Rather than starting training from scratch, it can be faster to train from the previous optimum. This is fairly simple and can be accomplished as

```python
calc = BPNeural(load='old_calc', label='new_calc')
calc.train(images=all_images)
```

In this case, all_images contains all 11,000 images (the old set and the new set).

If you are training on a large set of images, a building-up strategy can be effective. For example, to train on 100,000 images, you could first train on 10,000 images, then add 10,000 images to the set and re-train from the previous parameters, and so on. If the images are nicely randomized, this can give you some confidence that you are not training inside too shallow a basin. The first image set, which you are starting from, should be representative of the variety of all images you will add later.
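The building-up strategy can be sketched as a loop in which each stage warm-starts from the previous one. The `train` callable below is a stand-in for `calc.train` combined with re-loading the calculator as shown above; it is illustrative, not the module's API:

```python
def train_in_stages(all_images, batch_size, train):
    # `train` is a stand-in for one training stage: it receives the
    # current (growing) training set plus the parameters from the
    # previous stage, and returns updated parameters.
    params = None
    training_set = []
    for start in range(0, len(all_images), batch_size):
        # Add the next batch of images, then re-train from the
        # previous optimum rather than from scratch.
        training_set = training_set + all_images[start:start + batch_size]
        params = train(training_set, params)
    return params
```

In real use, each stage would save its parameter file and the next stage would pass that file to the `load` keyword.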

## Parallel computing

As the number of training images or the number of atoms in each image increases, serial computing takes a long time, and resorting to parallel computing becomes inevitable. Two common approaches to parallel computing are to parallelize over multiple threads or over multiple processes. In the former approach, threads usually have access to the same memory area, which can lead to conflicts in the case of improper synchronization. In the latter approach, completely separate memory locations are allocated to each spawned process, which avoids any possible interference between sub-tasks.

Currently, the BPNeural calculator takes the second approach via multiprocessing, a built-in module in the Python 2.6 (and above) standard library. For details about the multiprocessing module, the reader is directed to the official documentation. Only local concurrency (multi-computing over local cores) is possible in the current release of BPNeural; it is planned that remote concurrency (multi-computing over different machines) will also become possible in future releases. The code starts with serial computing, but upon reaching the calculation of atomic "fingerprints", multiple processes are spawned, each doing a sub-task on part of the images and putting the results back into multiprocessing Queues. All the calculated values in the Queues are then gathered at the end and sent to the next step. A similar scheme of sharing images between multiple cores is used when the neural network feeds forward as well as when the corresponding error function is propagated backward.
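The fingerprint stage described above can be mimicked with a toy example, in which the real symmetry-function calculation is replaced by a trivial stand-in: each spawned process fingerprints its share of the images and reports results through a multiprocessing Queue, which the parent process then gathers:

```python
import multiprocessing


def fingerprint(image):
    # Trivial stand-in for a real symmetry-function calculation;
    # here each "image" is just a list of numbers.
    return sum(x * x for x in image)


def _worker(chunk, indices, queue):
    # Each process fingerprints its share of images and reports the
    # results, keyed by image index, through the shared queue.
    for i, image in zip(indices, chunk):
        queue.put((i, fingerprint(image)))


def parallel_fingerprints(images, cores=2):
    queue = multiprocessing.Queue()
    processes = []
    for rank in range(cores):
        # Deal the images out round-robin among the processes.
        indices = list(range(rank, len(images), cores))
        chunk = [images[i] for i in indices]
        p = multiprocessing.Process(target=_worker,
                                    args=(chunk, indices, queue))
        p.start()
        processes.append(p)
    # Gather one result per image, then reassemble them in order.
    results = {}
    for _ in range(len(images)):
        i, value = queue.get()
        results[i] = value
    for p in processes:
        p.join()
    return [results[i] for i in range(len(images))]
```

Because each process has its own memory, the only communication is through the queue, which matches the scheme described above.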

The code automatically finds the number of available cores. A schematic of how multiprocessing is designed in BPNeural is shown below:


## Complete example

### Example with cross-validation

The script below shows an example of training the neural network calculator. This script first creates some sample data using a molecular dynamics simulation with the cheap EMT calculator in ASE. The images are then randomly divided into "training" and "test" data, such that cross-validation can be performed. The network is trained on the training data. After this, the predictions of the machine-learning algorithm are compared to the parent data for both the training and test set. This is plotted in a parity plot, which is saved to disk.

```python
from ase.lattice.surface import fcc110
from ase import Atom, Atoms
from ase.constraints import FixAtoms
from ase.calculators.emt import EMT
from ase.md import VelocityVerlet
from ase.md.velocitydistribution import MaxwellBoltzmannDistribution
from ase import units
from ase import io

from neural.utilities import randomize_images
from neural.bp import BPNeural


def testBP():

    # Generate atomic system to create test data.
    atoms = fcc110('Pt', (2, 2, 2), vacuum=7.)
    adsorbate = Atoms([Atom('Cu', atoms[7].position + (0., 0., 2.)),
                       Atom('Cu', atoms[7].position + (0., 0., 5.))])
    atoms.extend(adsorbate)
    atoms.set_constraint(FixAtoms(indices=[0, 2]))
    calc = EMT()  # cheap calculator
    atoms.set_calculator(calc)

    # Run some molecular dynamics to generate data.
    trajectory = io.Trajectory('data.traj', 'w', atoms=atoms)
    MaxwellBoltzmannDistribution(atoms, temp=300. * units.kB)
    dynamics = VelocityVerlet(atoms, dt=1. * units.fs)
    for step in range(50):
        dynamics.run(1)
        trajectory.write()
    trajectory.close()

    # Train the calculator.
    train_images, test_images = randomize_images('data.traj')

    calc = BPNeural()
    calc.train(train_images, force_goal=None)

    # Test the predictions.
    from matplotlib import pyplot

    fig, ax = pyplot.subplots()

    for image in train_images:
        actual_energy = image.get_potential_energy()
        predicted_energy = calc.get_potential_energy(image)
        ax.plot(actual_energy, predicted_energy, 'b.')

    for image in test_images:
        actual_energy = image.get_potential_energy()
        predicted_energy = calc.get_potential_energy(image)
        ax.plot(actual_energy, predicted_energy, 'r.')

    ax.set_xlabel('Actual energy, eV')
    ax.set_ylabel('Predicted energy, eV')
    fig.savefig('parity-plot.png')


if __name__ == '__main__':
    testBP()
```

Note that most of the script is just creating the training set or analyzing the results -- the actual neural-network creation and training takes place on just two lines in the middle of the script.

The figure that was produced is shown below. As both the generated data and the initial guess of the training parameters are random, this will look different each time this script is run.