pyiron_contrib.atomistics.atomistics.job.trainingcontainer module
Store structures together with energies and forces for potential fitting applications.
Basic usage:
>>> pr = Project("training")
>>> container = pr.create.job.TrainingContainer("small_structures")
Let’s make a structure and invent some forces
>>> structure = pr.create.structure.ase_bulk("Fe")
>>> forces = numpy.array([[-1, 1, -1]])
>>> container.add_structure(structure, energy=-1.234, forces=forces, identifier="Fe_bcc")
If you have a lot of precomputed structures you may also add them in bulk from a pandas DataFrame
>>> df = pandas.DataFrame({ "name": ["Fe_bcc"], "atoms": [structure], "energy": [-1.234], "forces": [forces] })
>>> container.include_dataset(df)
You can retrieve the full database with TrainingContainer.to_pandas() like this
>>> container.to_pandas()
name atoms energy forces number_of_atoms
Fe_bcc ...
- class pyiron_contrib.atomistics.atomistics.job.trainingcontainer.TrainingContainer(project, job_name)[source]
Bases: GenericJob, HasStructure
Stores ASE structures with energies and forces.
- add_structure(structure, energy, forces=None, stress=None, identifier=None, **arrays)[source]
Add new structure to structure list and save energy and forces with it.
For consistency with the rest of pyiron, energy should be in units of eV and forces in eV/A, but no conversion is performed.
- Parameters
structure (Atoms) – structure to add
energy (float) – energy of the whole structure
forces (Nx3 array of float, optional) – per-atom forces, where N is the number of atoms in the structure
stress (6 array of float, optional) – per-structure stress in Voigt notation
identifier (str, optional) – name describing the structure
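The stress argument takes six components in Voigt notation. The following is a minimal sketch of flattening a symmetric 3x3 stress tensor into that form. The helper name `to_voigt` is hypothetical, and the component ordering (xx, yy, zz, yz, xz, xy) is the common convention assumed here; this reference does not state which ordering pyiron expects, so verify it before relying on the sketch.

```python
def to_voigt(tensor):
    """Flatten a symmetric 3x3 tensor into six Voigt components.

    Assumed ordering (not confirmed for pyiron): xx, yy, zz, yz, xz, xy.
    """
    if len(tensor) != 3 or any(len(row) != 3 for row in tensor):
        raise ValueError("expected a 3x3 tensor")
    return [
        tensor[0][0], tensor[1][1], tensor[2][2],  # diagonal terms
        tensor[1][2], tensor[0][2], tensor[0][1],  # off-diagonal terms
    ]


stress_tensor = [
    [1.0, 0.1, 0.2],
    [0.1, 2.0, 0.3],
    [0.2, 0.3, 3.0],
]
stress_voigt = to_voigt(stress_tensor)
```

The resulting six-element list has the shape that the stress parameter describes.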
- collect_output()[source]
Collect the output files of the external executable and store the information in the HDF5 file. This method has to be implemented in the individual hamiltonians.
- from_hdf(hdf=None, group_name=None)[source]
Restore the GenericJob from an HDF5 file
- Parameters
hdf (ProjectHDFio) – HDF5 group object - optional
group_name (str) – HDF5 subgroup name - optional
- get_elements()[source]
Return a list of chemical elements in the training set.
- Returns
list of unique elements in the training set as strings of their standard abbreviations
- Return type
list
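As an illustration of the kind of result get_elements() describes, the sketch below collects unique element symbols from plain lists of symbols. It is not the library's implementation: `unique_elements` is a hypothetical helper, each structure is represented only by its list of chemical symbols, and the sorted ordering is an assumption.

```python
def unique_elements(structures):
    """Return the unique chemical symbols found across all structures."""
    seen = set()
    for symbols in structures:
        seen.update(symbols)
    # Sorting is an assumption made here to get a reproducible order.
    return sorted(seen)


# Each inner list stands in for the symbols of one structure.
training_set = [["Fe", "Fe"], ["Fe", "C"], ["Cr", "Fe", "C"]]
elements = unique_elements(training_set)
```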
- get_neighbors(num_neighbors=None)[source]
Calculate and add neighbor information in each structure.
If input.save_neighbors is True the data is automatically added to the internal storage and will be saved together with the normal structure data.
- Parameters
num_neighbors (int, optional) – Number of neighbors to collect, if not given use value from input
- Returns
neighbor information
- Return type
NeighborsTrajectory
- include_dataset(dataset)[source]
Add a pandas DataFrame to the saved structures.
- The dataframe should have the following columns:
name: human readable name of the structure
atoms (ase.Atoms): the atomic structure
energy (float): energy of the whole structure
forces (Nx3 array of float): per-atom forces, where N is the number of atoms in the structure
stress (6 array of float): per-structure stress in Voigt notation
- include_job(job, iteration_step=- 1)[source]
Add structure, energy and forces from job.
- Parameters
job (AtomisticGenericJob) – job to take structure from
iteration_step (int, optional) – if job has multiple steps, this selects which to add
- include_structure(structure, energy=None, name=None, **properties)[source]
Add new structure to structure list and save energy and forces with it.
For consistency with the rest of pyiron, energy should be in units of eV and forces in eV/A, but no conversion is performed.
- Parameters
structure (Atoms) – structure to add
energy (float) – energy of the whole structure
forces (Nx3 array of float, optional) – per-atom forces, where N is the number of atoms in the structure
stress (6 array of float, optional) – per-structure stress in Voigt notation
name (str, optional) – name describing the structure
- iter(*arrays, wrap_atoms=True)[source]
Iterate over all structures in this object and all arrays that are defined
- Parameters
wrap_atoms (bool) – True if the atoms are to be wrapped back into the unit cell; passed to get_structure()
*arrays (str) – names of the arrays that should be iterated over
- Yields
pyiron_atomistics.atomistics.structure.atoms.Atoms, arrays – every structure attached to the object and the queried arrays
- property plot
plotting interface
- Type
TrainingPlots
- run_if_interactive()[source]
For jobs whose executables are also available as a Python library, the calculation can be run with a library call instead of calling an external executable. This is usually faster than a single core python job.
- sample(name: str, selector: Callable[[StructureStorage, int], bool], delete_existing_job: bool = False) TrainingContainer [source]
Create a new TrainingContainer with structures filtered by selector.
self must have status finished. selector is passed the underlying StructureStorage of this container and the index of the structure, and returns a boolean indicating whether to include the structure in the new container. The new container is saved and run.
- Parameters
name (str) – name of the new TrainingContainer
selector (Callable[[StructureStorage, int], bool]) – callable that selects structure to include
delete_existing_job (bool) – if a job with the given name exists, remove it first
- Returns
new container with selected structures
- Return type
TrainingContainer
- Raises
ValueError – if a job with the given name already exists.
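A selector is any callable that takes the storage and a structure index and returns a boolean. The sketch below shows one that keeps structures with negative energy. `StandInStorage` is a hypothetical stand-in that only mimics get_array(name, index), so that the selector can be tried without a pyiron project; `negative_energy` is likewise an illustrative name.

```python
class StandInStorage:
    """Hypothetical stand-in exposing only get_array(name, index)."""

    def __init__(self, energies):
        self._arrays = {"energy": list(energies)}

    def get_array(self, name, index):
        return self._arrays[name][index]


def negative_energy(storage, index):
    """Selector: keep a structure only if its stored energy is below zero."""
    return storage.get_array("energy", index) < 0.0


storage = StandInStorage(energies=[-1.2, 0.5, -3.4])
selected = [i for i in range(3) if negative_energy(storage, i)]
```

With a finished container, the same callable would be passed as the selector argument, for example container.sample("negative_only", negative_energy).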
- to_hdf(hdf=None, group_name=None)[source]
Store the GenericJob in an HDF5 file
- Parameters
hdf (ProjectHDFio) – HDF5 group object - optional
group_name (str) – HDF5 subgroup name - optional
- to_list(filter_function=None)[source]
Returns the data as lists of pyiron structures, energies, forces, and the number of atoms
- Parameters
filter_function (function) – Function applied to the dataset (which is a pandas DataFrame) to filter it
- Returns
list of structures, energies, forces, and the number of atoms
- Return type
tuple
- to_pandas()[source]
Export the list of structures to a pandas table for external fitting codes.
- The table contains the following columns:
‘name’: human-readable name of the structure
‘ase_atoms’: the structure as an Atoms object
‘energy’: the energy of the full structure
‘forces’: the per-atom forces as a numpy.ndarray, shape Nx3
‘stress’: the per-structure stress as a numpy.ndarray, shape 6
‘number_of_atoms’: the number of atoms in the structure, N
- Returns
collected structures
- Return type
pandas.DataFrame
- class pyiron_contrib.atomistics.atomistics.job.trainingcontainer.TrainingPlots(train)[source]
Bases: object
Simple interface to plot various properties of the structures inside the given TrainingContainer.
- cell(angle_in_degrees=True)[source]
Plot histograms of cell parameters.
Plotted are atomic volume, density, cell vector lengths and cell vector angles in separate subplots all on a log-scale.
- Parameters
angle_in_degrees (bool) – whether unit for angles is degree or radians
- Returns
- contains the plotted information in the columns:
a: length of first vector
b: length of second vector
c: length of third vector
alpha: angle between first and second vector
beta: angle between second and third vector
gamma: angle between third and first vector
V: volume of the cell
N: number of atoms in the cell
- Return type
DataFrame
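To make the returned columns concrete, the sketch below computes them for a single cell given as three vectors. It is not the library's implementation: `cell_parameters` is a hypothetical helper, and the angle assignments follow the column descriptions above (alpha between the first and second vector, and so on).

```python
import math


def cell_parameters(cell, n_atoms, angle_in_degrees=True):
    """Summarize a cell given as three 3-component vectors."""
    a_vec, b_vec, c_vec = cell

    def dot(u, v):
        return sum(x * y for x, y in zip(u, v))

    def angle(u, v):
        cosine = dot(u, v) / math.sqrt(dot(u, u) * dot(v, v))
        theta = math.acos(max(-1.0, min(1.0, cosine)))  # clamp rounding noise
        return math.degrees(theta) if angle_in_degrees else theta

    # Cell volume is the absolute scalar triple product a . (b x c).
    cross = (
        b_vec[1] * c_vec[2] - b_vec[2] * c_vec[1],
        b_vec[2] * c_vec[0] - b_vec[0] * c_vec[2],
        b_vec[0] * c_vec[1] - b_vec[1] * c_vec[0],
    )
    return {
        "a": math.sqrt(dot(a_vec, a_vec)),
        "b": math.sqrt(dot(b_vec, b_vec)),
        "c": math.sqrt(dot(c_vec, c_vec)),
        "alpha": angle(a_vec, b_vec),
        "beta": angle(b_vec, c_vec),
        "gamma": angle(c_vec, a_vec),
        "V": abs(dot(a_vec, cross)),
        "N": n_atoms,
    }


# A cubic cell with edge length 2 that holds two atoms.
row = cell_parameters(
    [[2.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 2.0]], n_atoms=2
)
```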
- coordination(num_shells=4, log=True)[source]
Plot histogram of coordination in neighbor shells.
Computes one histogram of the number of neighbors in each neighbor shell up to num_shells and then plots them together.
- Parameters
num_shells (int) – maximum shell to plot
log (bool) – plot histogram values on a log scale
- energy_volume(crystal_systems=False)[source]
Plot volume vs. energy.
Volume and energy are normalized per atom before plotting.
- Parameters
crystal_systems (bool) – if True, plot & label structures of different crystal systems separately.
- Returns
- contains atomic energies and volumes in the columns ‘E’ and ‘V’; if crystal_systems is True,
also contains the space group and crystal system of each structure
- Return type
DataFrame
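The per-atom normalization mentioned above can be sketched as follows. This is an illustration under assumptions, not the plotting code itself: `per_atom` is a hypothetical helper and plain lists stand in for the table columns.

```python
def per_atom(energies, volumes, numbers_of_atoms):
    """Divide total energies and cell volumes by the number of atoms."""
    return [
        {"E": energy / n, "V": volume / n}
        for energy, volume, n in zip(energies, volumes, numbers_of_atoms)
    ]


# Two structures: a two-atom cell and a one-atom cell.
rows = per_atom(energies=[-8.0, -4.0], volumes=[22.0, 12.0], numbers_of_atoms=[2, 1])
```

Normalizing this way puts structures of different sizes on one energy-volume plot.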
- forces(axis: Optional[int] = None)[source]
Plot a histogram of all forces.
- Parameters
axis (int, optional) – plot only forces along this axis, if not given plot all forces
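The effect of the axis argument can be sketched on nested lists. The helper `collect_forces` is hypothetical, and it assumes that "all forces" means every Cartesian component of every atom; the actual method may treat the data differently.

```python
def collect_forces(all_forces, axis=None):
    """Gather force components from all structures into one flat list."""
    values = []
    for forces in all_forces:  # one Nx3 list per structure
        for vector in forces:  # one force vector per atom
            if axis is None:
                values.extend(vector)  # keep every component
            else:
                values.append(vector[axis])  # keep a single component
    return values


all_forces = [
    [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]],  # structure with two atoms
    [[7.0, 8.0, 9.0]],  # structure with one atom
]
x_components = collect_forces(all_forces, axis=0)
all_components = collect_forces(all_forces)
```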
- shell_distances(num_shells=4)[source]
Plot a violin plot of the neighbor distances in shells up to num_shells.
- Parameters
num_shells (int) – maximum shell to plot
- spacegroups(symprec=0.001)[source]
Plot histograms of space groups and crystal systems.
Spacegroups and crystal systems are plotted in separate subplots.
- Parameters
symprec (float) – precision of the symmetry search (passed to spglib)
- Returns
- contains two columns “space_group”, “crystal_system”
for each structure in train
- Return type
DataFrame
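The relation between the two returned columns follows the standard ranges of the 230 space group numbers. The sketch below maps a space group number to its crystal system; `crystal_system` is a hypothetical helper, and the exact labels used in the returned DataFrame are an assumption.

```python
def crystal_system(space_group_number):
    """Map an international space group number (1-230) to its crystal system."""
    if not 1 <= space_group_number <= 230:
        raise ValueError("space group number must be between 1 and 230")
    # Upper bound of each crystal system in the international numbering.
    upper_bounds = [
        (2, "triclinic"),
        (15, "monoclinic"),
        (74, "orthorhombic"),
        (142, "tetragonal"),
        (167, "trigonal"),
        (194, "hexagonal"),
        (230, "cubic"),
    ]
    for upper, name in upper_bounds:
        if space_group_number <= upper:
            return name


bcc_system = crystal_system(229)  # Im-3m, e.g. bcc Fe
hcp_system = crystal_system(194)  # P6_3/mmc, e.g. hcp structures
```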
- class pyiron_contrib.atomistics.atomistics.job.trainingcontainer.TrainingStorage[source]
Bases: StructureStorage
- add_structure(structure: Atoms, energy, identifier=None, **arrays) None [source]
Add a new structure to the container.
Additional keyword arguments given specify additional arrays to store for the structure. If an array with the given keyword name does not exist yet, it will be added to the container.
>>> container = StructureStorage()
>>> container.add_structure(Atoms(...), identifier="A", energy=3.14)
>>> container.get_array("energy", 0)
3.14
If the first axis of the extra array matches the length of the given structure, it will be added as a per-atom array, otherwise as a per-structure array.
>>> structure = Atoms(...)
>>> container.add_structure(structure, identifier="B", forces=len(structure) * [[0,0,0]])
>>> len(container.get_array("forces", 1)) == len(structure)
True
Reshaping the array so that its first axis has length 1 forces it to be stored as a per-structure array. That axis will then be stripped.
>>> container.add_structure(Atoms(...), identifier="C", pressure=np.eye(3)[np.newaxis, :, :])
>>> container.get_array("pressure", 2).shape
(3, 3)
- Parameters
structure (Atoms) – structure to add
identifier (str, optional) – human-readable name for the structure, if None use current structure index as string
**arrays – additional arrays to store for structure
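The rule for sorting extra arrays into per-atom and per-structure storage can be sketched with nested lists. This is an illustration of the rule as described above, not the library's code: `classify_array` is a hypothetical helper, and checking the per-atom match before the length-1 case is an assumption.

```python
def classify_array(n_atoms, array):
    """Return (kind, value) following the rule described for extra arrays."""
    if not hasattr(array, "__len__"):
        return "per_structure", array  # scalars such as an energy
    if len(array) == n_atoms:
        return "per_atom", array  # first axis matches the structure
    if len(array) == 1:
        return "per_structure", array[0]  # strip the leading length-1 axis
    return "per_structure", array


identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
forces_kind, _ = classify_array(2, [[0, 0, 0], [0, 0, 0]])
pressure_kind, pressure = classify_array(2, [identity])
energy_kind, energy = classify_array(2, 3.14)
```

Note that for a structure with a single atom the two cases coincide, so the order of the checks matters there.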
- include_dataset(dataset)[source]
Add a pandas DataFrame to the saved structures.
- The dataframe should have the following columns:
name: human readable name of the structure
atoms (ase.Atoms): the atomic structure
energy (float): energy of the whole structure
forces (Nx3 array of float): per-atom forces, where N is the number of atoms in the structure
charges (Nx3 array of floats):
stress (6 array of float): per-structure stress in Voigt notation
- include_job(job, iteration_step=- 1)[source]
Add structure, energy and forces from job.
- Parameters
job (AtomisticGenericJob) – job to take structure from
iteration_step (int, optional) – if job has multiple steps, this selects which to add
- include_structure(structure, energy, name=None, **properties)[source]
Add new structure to structure list and save energy and forces with it.
For consistency with the rest of pyiron, energy should be in units of eV and forces in eV/A, but no conversion is performed.
- Parameters
structure (Atoms) – structure to add
energy (float) – energy of the whole structure
forces (Nx3 array of float, optional) – per-atom forces, where N is the number of atoms in the structure
stress (6 array of float, optional) – per-structure stress in Voigt notation
name (str, optional) – name describing the structure
- iter(*arrays, wrap_atoms=True)[source]
Iterate over all structures in this object and all arrays that are defined
- Parameters
wrap_atoms (bool) – True if the atoms are to be wrapped back into the unit cell; passed to get_structure()
*arrays (str) – names of the arrays that should be iterated over
- Yields
pyiron_atomistics.atomistics.structure.atoms.Atoms, arrays – every structure attached to the object and the queried arrays
- property plot
plotting interface
- Type
TrainingPlots
- to_list(filter_function=None)[source]
Returns the data as lists of pyiron structures, energies, forces, and the number of atoms
- Parameters
filter_function (function) – Function applied to the dataset (which is a pandas DataFrame) to filter it
- Returns
list of structures, energies, forces, and the number of atoms
- Return type
tuple
- to_pandas()[source]
Export the list of structures to a pandas table for external fitting codes.
- The table contains the following columns:
‘name’: human-readable name of the structure
‘ase_atoms’: the structure as an Atoms object
‘energy’: the energy of the full structure
‘forces’: the per-atom forces as a numpy.ndarray, shape Nx3
‘stress’: the per-structure stress as a numpy.ndarray, shape 6
‘number_of_atoms’: the number of atoms in the structure, N
- Returns
collected structures
- Return type
pandas.DataFrame