PDB, ENT (Standard PDB file)

Coordinate reader	`MDAnalysis.coordinates.PDB.PDBReader`
Coordinate writer	`MDAnalysis.coordinates.PDB.PDBWriter`
Topology parser	`MDAnalysis.topology.PDBParser.PDBParser`

Reading in

MDAnalysis parses the following PDB records (see PDB coordinate section for details):

CRYST1 for unit cell dimensions A, B, C, alpha, beta, gamma

ATOM or HETATM for serial, name, resName, chainID, resSeq, x, y, z, occupancy, tempFactor, segID

CONECT records for bonds

HEADER (Universe.trajectory.header)

TITLE (Universe.trajectory.title)

COMPND (Universe.trajectory.compound)

REMARK (Universe.trajectory.remarks)

All other lines are ignored. Multi-MODEL PDB files are read as trajectories with a default timestep of 1 ps (pass the dt argument to change this). MDAnalysis currently cannot read multi-model PDB files written by VMD, because VMD separates models with the keyword “END” instead of the “MODEL”/“ENDMDL” keywords.
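To illustrate why the frame separator matters, here is a minimal standard-library sketch of MODEL/ENDMDL frame splitting. It is not the MDAnalysis implementation: the names `split_models` and `frame_times` are hypothetical, and the assumption that frame times start at 0 is made only for the example.

```python
def split_models(lines):
    """Group ATOM/HETATM records into frames delimited by MODEL/ENDMDL."""
    frames, current = [], []
    for line in lines:
        record = line[:6].strip()
        if record == "MODEL":
            current = []              # start a new frame
        elif record == "ENDMDL":
            frames.append(current)    # close the current frame
            current = []
        elif record in ("ATOM", "HETATM"):
            current.append(line)
        # every other record, including a bare END, is ignored
    if current:
        frames.append(current)        # file without MODEL/ENDMDL records
    return frames


def frame_times(n_frames, dt=1.0):
    """Times in ps for evenly spaced frames (assumed to start at 0)."""
    return [i * dt for i in range(n_frames)]


# Shortened records: only the record name in columns 1-6 matters here.
standard = ["MODEL        1", "ATOM      1", "ENDMDL",
            "MODEL        2", "ATOM      1", "ENDMDL", "END"]
vmd_style = ["ATOM      1", "END", "ATOM      1", "END"]

standard_frames = split_models(standard)   # two frames of one atom each
vmd_frames = split_models(vmd_style)       # one frame holding both atoms
times = frame_times(len(standard_frames), dt=2.0)
```

In this sketch the END-separated input collapses into a single frame, which is the kind of mismatch that arises when a reader relies on MODEL/ENDMDL.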

Important

Previously, MDAnalysis did not read elements from a file. Now, if valid elements are provided, MDAnalysis will read them in and will not guess them from atom names.

MDAnalysis attempts to read segid attributes from the segID column. If this column does not contain information, segments are instead created from chainIDs. If chainIDs are also not present, then segids are set to the default 'SYSTEM' value.
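The fallback order described above can be sketched as a small function. This is an illustration only: `resolve_segid` is a hypothetical helper, and it applies the rule per atom, whereas the parser may make this decision for the whole file.

```python
def resolve_segid(segid_field, chainid_field, default="SYSTEM"):
    """Pick a segid: segID column first, then chainID, then the default."""
    segid = segid_field.strip()
    if segid:
        return segid
    chainid = chainid_field.strip()
    if chainid:
        return chainid
    return default


from_segid = resolve_segid("PROA", "A")     # segID column is populated
from_chain = resolve_segid("    ", "A")     # blank segID, chainID present
from_default = resolve_segid("    ", " ")   # both columns blank
```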

Writing out

MDAnalysis can both write single-frame PDBs and convert trajectories to multi-model PDBs. If the Universe is missing fields that are required in a PDB file, MDAnalysis provides default values and raises a warning. There are two exceptions:

chainIDs: if a Universe does not have chainIDs, MDAnalysis uses the first character of the segment segid instead.

elements: Elements are always guessed from the atom name.

These are the default values:

names: ‘X’

altLocs: ‘’

resnames: ‘UNK’

icodes: ‘’

segids: ‘’

resids: 1

occupancies: 1.0

tempfactors: 0.0
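As a rough sketch of the behaviour described in this section, the defaults and the chainID exception can be written as a lookup with a warning. The names `PDB_DEFAULTS`, `attribute_or_default`, and `chainid_for_writing` are hypothetical and operate on a plain dict for a single atom; they do not mirror the writer's internals.

```python
import warnings

# Default values listed above, keyed by attribute name.
PDB_DEFAULTS = {
    "names": "X",
    "altLocs": "",
    "resnames": "UNK",
    "icodes": "",
    "segids": "",
    "resids": 1,
    "occupancies": 1.0,
    "tempfactors": 0.0,
}


def attribute_or_default(attrs, key):
    """Return attrs[key], or the documented default with a warning."""
    if key in attrs:
        return attrs[key]
    warnings.warn(f"Missing '{key}'; using default {PDB_DEFAULTS[key]!r}")
    return PDB_DEFAULTS[key]


def chainid_for_writing(attrs):
    """Use the chainID if present, else the first character of the segid."""
    if attrs.get("chainIDs"):
        return attrs["chainIDs"]
    return attrs.get("segids", "")[:1]


atom = {"names": "CA", "segids": "PROA"}
resname = attribute_or_default(atom, "resnames")   # falls back to the default
chain = chainid_for_writing(atom)                  # first character of "PROA"
```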

PDB specification

CRYST1 fields
COLUMNS	DATA TYPE	FIELD	DEFINITION
1 - 6	Record name	“CRYST1”
7 - 15	Real(9.3)	a	a (Angstroms).
16 - 24	Real(9.3)	b	b (Angstroms).
25 - 33	Real(9.3)	c	c (Angstroms).
34 - 40	Real(7.2)	alpha	alpha (degrees).
41 - 47	Real(7.2)	beta	beta (degrees).
48 - 54	Real(7.2)	gamma	gamma (degrees).
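The CRYST1 column layout translates directly into fixed-width slicing. The following standard-library sketch formats a record and parses it back; `format_cryst1` and `parse_cryst1` are hypothetical helper names, not MDAnalysis functions. Columns in the table are 1-based and inclusive, so columns 7 - 15 become the Python slice `[6:15]`.

```python
def format_cryst1(a, b, c, alpha, beta, gamma):
    """Build a CRYST1 record using Real(9.3) lengths and Real(7.2) angles."""
    return (f"CRYST1{a:9.3f}{b:9.3f}{c:9.3f}"
            f"{alpha:7.2f}{beta:7.2f}{gamma:7.2f}")


def parse_cryst1(line):
    """Return (a, b, c, alpha, beta, gamma) from a CRYST1 record."""
    if not line.startswith("CRYST1"):
        raise ValueError("not a CRYST1 record")
    # 0-based, end-exclusive slices for columns 7-15, 16-24, ..., 48-54
    bounds = [(6, 15), (15, 24), (24, 33), (33, 40), (40, 47), (47, 54)]
    return tuple(float(line[start:end]) for start, end in bounds)


cryst1_line = format_cryst1(52.0, 58.6, 61.9, 90.0, 90.0, 90.0)
cell = parse_cryst1(cryst1_line)
```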

ATOM/HETATM fields
COLUMNS	DATA TYPE	FIELD	DEFINITION
1 - 6	Record name	“ATOM  ” or “HETATM”
7 - 11	Integer	serial	Atom serial number.
13 - 16	Atom	name	Atom name.
17	Character	altLoc	Alternate location indicator.
18 - 21	Residue name	resName	Residue name.
22	Character	chainID	Chain identifier.
23 - 26	Integer	resSeq	Residue sequence number.
27	AChar	iCode	Code for insertion of residues.
31 - 38	Real(8.3)	x	Orthogonal coordinates for X in Angstroms.
39 - 46	Real(8.3)	y	Orthogonal coordinates for Y in Angstroms.
47 - 54	Real(8.3)	z	Orthogonal coordinates for Z in Angstroms.
55 - 60	Real(6.2)	occupancy	Occupancy.
61 - 66	Real(6.2)	tempFactor	Temperature factor.
67 - 72			(not used in the official PDB format)
73 - 76	String	segID	(unofficial PDB format)
77 - 78	LString(2)	element	Element symbol, right-justified.
79 - 80	LString(2)	charge	Charge on the atom.

Changed in version 2.10.0: Columns 67-72 are no longer read by MDAnalysis.

Note

Columns 73-76 are not part of the official PDB format, but some programs use them to store or operate on the segment ID. For instance, Chimera assigns it as the attribute pdbSegment.
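The ATOM/HETATM column layout in the table above can be sketched as fixed-width slicing in plain Python. This is an illustration of the layout, not the MDAnalysis parser: `ATOM_FIELDS` and `parse_atom_line` are hypothetical names, columns 67-72 are skipped, and blank numeric fields would raise a `ValueError` in this simple version.

```python
# field name -> (0-based start, end-exclusive, converter)
ATOM_FIELDS = {
    "serial":     (6, 11, int),
    "name":       (12, 16, str.strip),
    "altLoc":     (16, 17, str.strip),
    "resName":    (17, 21, str.strip),
    "chainID":    (21, 22, str.strip),
    "resSeq":     (22, 26, int),
    "iCode":      (26, 27, str.strip),
    "x":          (30, 38, float),
    "y":          (38, 46, float),
    "z":          (46, 54, float),
    "occupancy":  (54, 60, float),
    "tempFactor": (60, 66, float),
    "segID":      (72, 76, str.strip),   # unofficial column
    "element":    (76, 78, str.strip),
    "charge":     (78, 80, str.strip),
}


def parse_atom_line(line):
    """Split one ATOM/HETATM record into a dict of typed fields."""
    record = line[:6].strip()
    if record not in ("ATOM", "HETATM"):
        raise ValueError("not an ATOM/HETATM record")
    line = line.ljust(80)                # pad short lines to 80 columns
    fields = {"record": record}
    for name, (start, end, convert) in ATOM_FIELDS.items():
        fields[name] = convert(line[start:end])
    return fields


# Example record assembled field by field so the widths stay visible.
atom_line = (
    f"{'ATOM':<6}{1:>5} {'CA':<4}{'':1}{'ALA':<4}{'A':1}{12:>4}{'':1}   "
    f"{1.234:8.3f}{-5.678:8.3f}{9.012:8.3f}{1.0:6.2f}{20.5:6.2f}      "
    f"{'PROA':<4}{'C':>2}{'':2}"
)
atom = parse_atom_line(atom_line)
```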