A sequence in FASTA format begins with a single-line description,
followed by lines of sequence data. The description line is
distinguished from the sequence data by a greater-than (">") symbol
in the first column. It is recommended that all lines of text be
shorter than 80 characters.
An example sequence in FASTA format is:
>sp|P08988|AAC4_SALSP AMINOGLYCOSIDE N3'-ACETYLTRANSFERASE IV (EC 2.3.1.81)
MQYEWRKAELIGQLLNLGVTPGGVLLVHSSFRSVRPLEDGPLGLIEALRAALGPGGTLVMPSWSGLDDEPFDPATSPVTP
DLGVVSDTFWRLPNVKRSAHPFAFAAAGPQAEQIISDPLPLPPHSPASPVARVHELDGQVLLLGVGHDANTTLHLAELMA
KVPYGVPRHCTILQDGKLVRVDYLENDHCCERFALADRWLKEKSLQKEGPVGHAFARLIRSRDIVATALGQLGRDPLIFL
HPPEGGMRRMRCRSPVDWLSS

Sequences are expected to be represented in the standard IUB/IUPAC
amino acid codes, with these exceptions: a single hyphen or dash can
be used to represent a gap of indeterminate length; and in amino acid
sequences, U and * are acceptable letters (see below). Before
submitting a request, any numerical digits in the query sequence
should either be removed or replaced by appropriate letter codes
(e.g., N for unknown nucleic acid residue or X for unknown amino acid
residue).
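The layout and clean-up rules described above can be sketched in
Python. This is a minimal illustration, not the parser used by any
particular service: the function names parse_fasta and replace_digits
are hypothetical, and the sketch assumes that blank lines may be
skipped and that each digit is replaced one-for-one with a
placeholder letter.

```python
from typing import Iterator, List, Optional, Tuple


def parse_fasta(text: str) -> Iterator[Tuple[str, str]]:
    """Yield (description, sequence) pairs from FASTA-formatted text.

    A line with ">" in the first column starts a new record; the lines
    that follow are joined into one sequence string.
    """
    description: Optional[str] = None
    chunks: List[str] = []
    for raw_line in text.splitlines():
        line = raw_line.rstrip()
        if not line:
            continue  # assumption: blank lines carry no data
        if line.startswith(">"):
            # Emit the previous record before starting a new one.
            if description is not None:
                yield description, "".join(chunks)
            description = line[1:]
            chunks = []
        elif description is not None:
            chunks.append(line)
    if description is not None:
        yield description, "".join(chunks)


def replace_digits(sequence: str, placeholder: str = "X") -> str:
    """Replace every numerical digit with a placeholder letter.

    Use "X" for amino acid sequences and "N" for nucleic acid
    sequences, as suggested in the text above.
    """
    return "".join(placeholder if ch.isdigit() else ch for ch in sequence)
```

For example, list(parse_fasta(text)) collects every record in a
string, and replace_digits(seq, "N") cleans a nucleotide query before
submission. Removing the digits instead of replacing them is the
other option the text allows.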
A  alanine                    P  proline
B  aspartate or asparagine    Q  glutamine
C  cysteine                   R  arginine
D  aspartate                  S  serine
E  glutamate                  T  threonine
F  phenylalanine              U  selenocysteine
G  glycine                    V  valine
H  histidine                  W  tryptophan
I  isoleucine                 Y  tyrosine
K  lysine                     Z  glutamate or glutamine
L  leucine                    X  any
M  methionine                 *  translation stop
N  asparagine                 -  gap of indeterminate length
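
The table of accepted codes lends itself to a simple lookup. The
following Python sketch checks an amino acid sequence against it; the
names AMINO_ACID_CODES and find_invalid_residues are hypothetical,
and the sketch compares characters exactly as given, since the text
above does not say how letter case is handled.

```python
from typing import Dict, List, Tuple

# Accepted amino acid codes, transcribed from the table above.
AMINO_ACID_CODES: Dict[str, str] = {
    "A": "alanine",
    "B": "aspartate or asparagine",
    "C": "cysteine",
    "D": "aspartate",
    "E": "glutamate",
    "F": "phenylalanine",
    "G": "glycine",
    "H": "histidine",
    "I": "isoleucine",
    "K": "lysine",
    "L": "leucine",
    "M": "methionine",
    "N": "asparagine",
    "P": "proline",
    "Q": "glutamine",
    "R": "arginine",
    "S": "serine",
    "T": "threonine",
    "U": "selenocysteine",
    "V": "valine",
    "W": "tryptophan",
    "Y": "tyrosine",
    "Z": "glutamate or glutamine",
    "X": "any",
    "*": "translation stop",
    "-": "gap of indeterminate length",
}


def find_invalid_residues(sequence: str) -> List[Tuple[int, str]]:
    """Return (position, character) pairs for characters not in the table.

    Positions are zero-based. An empty list means every character is
    one of the accepted codes.
    """
    return [
        (index, ch)
        for index, ch in enumerate(sequence)
        if ch not in AMINO_ACID_CODES
    ]
```

A query can be checked with find_invalid_residues(seq) before
submission; any digits it reports should be removed or replaced with
X, as described above.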