
PeakProbe predicts likely solvent models for a given point (termed a ‘peak’) in a structure based on analysis (‘probing’) of its local electron density and chemical environment. PeakProbe maps a total of 19 resolution-dependent features associated with electron density and two associated with the local chemical environment to a two-dimensional score space that is independent of resolution. Peaks are classified based on the relative frequencies with which four different classes of solvent (including water) are observed within a given region of this score space as determined by large-scale sampling of solvent models in the Protein Data Bank. Designed to classify peaks generated from difference density maxima, PeakProbe also incorporates functionality for identifying peaks associated with model errors or clusters of peaks likely to correspond to multi-atom solvent, and for the validation of existing solvent models using solvent-omit electron-density maps. When tasked with classifying peaks into one of four distinct solvent classes, PeakProbe achieves greater than 99% accuracy for both peaks derived directly from the atomic coordinates of existing solvent models and those based on difference density maxima. While the program is still under development, a fully functional version is publicly available. PeakProbe makes extensive use of cctbx libraries, and requires a PHENIX licence and an up-to-date phenix.python environment for execution.

Current techniques in macromolecular X-ray crystallography derive structural information by the construction of a comprehensive model of X-ray scattering components within a crystal system. Besides the integral nucleic and amino-acid polymers found in macromolecular structures, other scattering components include ligands and cofactors associated with these polymers, and both ordered/explicit and bulk solvent. Crystallographic model building relies on reconstructing electron density from Fourier coefficients whose complex components (phases) are ultimately derived in whole or in large part by Fourier transformation of a model of all scattering components. As a result, crystallographic models are best built through an iterative process in which more complete and accurate models result in more accurate phases which provide sharper and more interpretable electron density, which in turn allows the assembly of a further improved model, further improved phases and so on. Thus, arriving at an optimal model for a crystal structure requires the inclusion of all scattering components, including the often numerous ordered small-molecule and solvent species within the crystal lattice (Drenth & Mesters, 2007). Water molecules are by far the most frequently modelled non-macromolecular species in macromolecular structures, followed by oxyanions, organic polymers and atomic ions. Specifically, a survey of 111 976 X-ray structures from the Protein Data Bank (PDB as of 29 June 2017) revealed that 93% of deposited structures include explicit water molecules, with an average of 320 water molecules per structure (over 32 million in total) and one water modelled for every 13 non-H macromolecular atoms (Berman et al., 2000; Gnesi & Carugo, 2017). After water, the tetrahedral oxyanions SO₄²⁻ and PO₄³⁻ are the next most frequent small-molecule species modelled, with 76 500 such molecules distributed among 18% of all structures.

Current software tools for the automated building of models for macromolecular X-ray crystal structures are capable of assembling high-quality models for ordered macromolecule and small-molecule scattering components with minimal or no user supervision. Many of these tools also incorporate robust functionality for modelling the ordered water molecules that are found in nearly all macromolecular crystal structures. However, no current tools focus on differentiating these ubiquitous water molecules from other frequently occurring multi-atom solvent species, such as sulfate, or on the automated building of models for such species. PeakProbe has been developed specifically to address the need for such a tool.
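
As an illustration of the classification idea described for PeakProbe, in which a peak is assigned according to the relative frequencies of solvent classes observed in its region of a two-dimensional score space, the following minimal Python sketch bins labelled reference scores on a square grid and classifies a new score by the most frequent class in its grid cell. This is not PeakProbe's implementation: the square binning, the bin width, the function names (`bin_index`, `build_frequency_table`, `classify`) and the toy reference data are all assumptions made for illustration, and the mapping of the 21 features onto the score space is not shown.

```python
from collections import Counter, defaultdict
import math


def bin_index(score, bin_width):
    """Map a 2D score (x, y) to the grid cell that contains it."""
    x, y = score
    return (math.floor(x / bin_width), math.floor(y / bin_width))


def build_frequency_table(labelled_scores, bin_width=0.5):
    """Count how often each solvent class is observed in each grid cell.

    labelled_scores: iterable of ((x, y), label) pairs, e.g. scores computed
    for solvent models sampled from deposited structures (hypothetical input).
    """
    table = defaultdict(Counter)
    for score, label in labelled_scores:
        table[bin_index(score, bin_width)][label] += 1
    return table


def classify(score, table, bin_width=0.5):
    """Return (best_label, relative_frequencies) for the cell holding score.

    Returns (None, {}) when the cell contains no reference observations.
    """
    counts = table.get(bin_index(score, bin_width))
    if not counts:
        return None, {}
    total = sum(counts.values())
    freqs = {label: float(n) / total for label, n in counts.items()}
    best = max(freqs, key=freqs.get)
    return best, freqs


# Toy reference data (invented for illustration only).
reference = [
    ((0.2, 0.3), "water"), ((0.4, 0.1), "water"), ((0.1, 0.4), "water"),
    ((0.3, 0.2), "sulfate"),
    ((1.2, 1.3), "sulfate"), ((1.4, 1.1), "sulfate"),
]
table = build_frequency_table(reference, bin_width=0.5)
label, freqs = classify((0.25, 0.25), table, bin_width=0.5)
```

In this toy example the query score falls in a cell dominated by water observations, so `label` is `"water"` and `freqs` holds the relative frequency of each class in that cell. A score that falls in an unpopulated cell yields `(None, {})`, which a caller could treat as unclassifiable rather than forcing an assignment.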
