Data Item _pdbx_investigation_entity_poly.seq_one_letter_code_sample

General

Item name
_pdbx_investigation_entity_poly.seq_one_letter_code_sample
Category name
pdbx_investigation_entity_poly
Attribute name
seq_one_letter_code_sample
Required in PDB entries
no
Used in current PDB entries
No

Item Description

Canonical sequence of a protein or nucleic acid polymer in standard one-letter codes of amino acids or nucleotides, corresponding to the sequence in _pdbx_investigation_entity_poly.seq_one_letter_code_with_ntsd. Non-standard amino acids/nucleotides are represented by the letter 'X', or by the parent amino acid/nucleotide if the Chemical Component Dictionary (CCD) entry has a _chem_comp.mon_nstd_parent_comp_id record. For modifications with several parent amino acids (e.g. chromophores), all corresponding parent amino acid codes are listed. Deoxynucleotides are represented by their canonical one-letter codes of A, C, G, or T.

A for alanine or adenine
B for ambiguous asparagine/aspartic acid
R for arginine
N for asparagine
D for aspartic acid
C for cysteine or cystine or cytosine
Q for glutamine
E for glutamic acid
Z for ambiguous glutamine/glutamic acid
G for glycine or guanine
H for histidine
I for isoleucine
L for leucine
K for lysine
M for methionine
F for phenylalanine
P for proline
S for serine
T for threonine or thymine
W for tryptophan
Y for tyrosine
V for valine
U for uracil
O for water
X for other
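
The substitution rule in this description can be illustrated with a minimal Python sketch. It is not the wwPDB implementation: the function name canonical_one_letter and both lookup tables are hypothetical and heavily abbreviated, and the parent assignments shown are assumptions that should be checked against _chem_comp.mon_nstd_parent_comp_id in the CCD.

```python
# Hypothetical, abbreviated tables for illustration only.
# Standard residues and deoxynucleotides map directly to one letter.
STANDARD = {
    "ALA": "A", "GLY": "G", "MET": "M", "SER": "S", "THR": "T", "TYR": "Y",
    "DA": "A", "DC": "C", "DG": "G", "DT": "T",
}

# Assumed parent records (stand-ins for _chem_comp.mon_nstd_parent_comp_id).
# A chromophore-like component can list several parents.
PARENTS = {
    "MSE": ["MET"],
    "SEP": ["SER"],
    "CRO": ["THR", "TYR", "GLY"],
}


def canonical_one_letter(comp_ids):
    """Build a canonical one-letter sequence from a list of component IDs."""
    letters = []
    for comp_id in comp_ids:
        if comp_id in STANDARD:
            # Standard residue or deoxynucleotide: use its own code.
            letters.append(STANDARD[comp_id])
        elif comp_id in PARENTS:
            # Non-standard with parent record: emit every parent's code.
            letters.extend(STANDARD.get(p, "X") for p in PARENTS[comp_id])
        else:
            # Non-standard with no parent record: 'X'.
            letters.append("X")
    return "".join(letters)
```

Under these assumed tables, a component with one parent contributes one letter, a component with several parents contributes one letter per parent, and any component missing from both tables contributes 'X'.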

Item Example

 
MSHHWGYGKHNGPEHWHKDFPIAKGERQSPVDIDTHTAKYDPSLKPLSVSYDQATSLRILNNGAAFNVEFD

Data Type

Data type code
text
Data type detail
Multi-line text
Primitive data type code
char
Regular expression
[ \n\t()_,.;:"&<>/\{}'`~!@#$%?+=*A-Za-z0-9|^-]*
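
As a rough illustration, a value can be checked against this regular expression in Python. The sketch assumes the pattern is meant to match the entire value, and the helper name is_valid_text is hypothetical.

```python
import re

# Pattern copied from the "Regular expression" entry above.
# A raw triple-quoted string is used because the class contains both quote marks.
TEXT_PATTERN = re.compile(r"""[ \n\t()_,.;:"&<>/\{}'`~!@#$%?+=*A-Za-z0-9|^-]*""")


def is_valid_text(value: str) -> bool:
    """Return True if every character of value is allowed by the pattern."""
    return TEXT_PATTERN.fullmatch(value) is not None
```

The sequence in the Item Example passes this check, as does a sequence wrapped over several lines, since the character class includes newline. A value containing square brackets would be rejected because they are not in the class.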