askill
bio-transcription-translation


Transcribe DNA to RNA and translate to protein using Biopython. Use when converting between DNA, RNA, and protein sequences, finding ORFs, or using alternative codon tables.

0 stars
1.2k downloads
Updated 2/5/2026

SKILL.md

Transcription and Translation

Convert between DNA, RNA, and protein sequences using Biopython.

Required Import

from Bio.Seq import Seq

Core Methods

Transcription (DNA to RNA)

dna = Seq('ATGCGATCGATCG')
rna = dna.transcribe()  # Returns Seq('AUGCGAUCGAUCG')

transcribe() only replaces T with U; it does not reverse-complement the sequence. The input must therefore be the coding (sense) strand, written 5' to 3'.
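
When only the template (antisense) strand is available, one common approach is to reverse-complement it first to recover the coding strand, then transcribe. A minimal sketch; the variable names are illustrative:

```python
from Bio.Seq import Seq

coding_dna = Seq('ATGCGATCGATCG')
# Template strand written 5' to 3' (the reverse complement of the coding strand)
template_dna = coding_dna.reverse_complement()

# Recover the coding strand, then swap T for U
mrna = template_dna.reverse_complement().transcribe()
print(mrna)  # AUGCGAUCGAUCG
```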

Back Transcription (RNA to DNA)

rna = Seq('AUGCGAUCGAUCG')
dna = rna.back_transcribe()  # Returns Seq('ATGCGATCGATCG')

Translation (RNA/DNA to Protein)

# From coding DNA (includes ATG start)
coding_dna = Seq('ATGTTTGGT')
protein = coding_dna.translate()  # Returns Seq('MFG')

# From RNA
rna = Seq('AUGUUUGGU')
protein = rna.translate()  # Returns Seq('MFG')

Translation Options

Stop at First Stop Codon

seq = Seq('ATGTTTGGTTAAGGG')
protein = seq.translate(to_stop=True)  # Stops at TAA and excludes it: Seq('MFG')

Include Stop Codon Symbol

seq = Seq('ATGTTTGGTTAA')
protein = seq.translate()  # Returns Seq('MFG*')
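
The two options behave differently when a stop codon sits in the middle of the sequence. The sketch below compares them on one made-up input and also shows the stop_symbol parameter, which changes the character used for stops:

```python
from Bio.Seq import Seq

seq = Seq('ATGTTTTAAGGTTGA')  # ATG TTT TAA GGT TGA

full = seq.translate()                   # reads through the internal stop
truncated = seq.translate(to_stop=True)  # ends before the first stop
custom = seq.translate(stop_symbol='@')  # same as full, different stop marker

print(full, truncated, custom)  # MF*G* MF MF@G@
```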

Alternative Codon Tables

Biopython supports NCBI codon tables. Common tables:

ID | Name                       | Use Case
1  | Standard                   | Default, most organisms
2  | Vertebrate Mitochondrial   | Human/vertebrate mitochondria
4  | Mold Mitochondrial         | Fungi, protozoa mitochondria
5  | Invertebrate Mitochondrial | Insects, worms mitochondria
6  | Ciliate Nuclear            | Tetrahymena, Paramecium
11 | Bacterial/Archaeal         | Prokaryotes, plastids

# Bacterial translation
seq = Seq('ATGTTTGGT')
protein = seq.translate(table=11)

# Mitochondrial translation
protein = seq.translate(table=2)

# By name
protein = seq.translate(table='Vertebrate Mitochondrial')
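
To see why the table choice matters, the sketch below translates one made-up sequence with four of the tables listed above. The codons ATA, TGA, and AGA were picked because their meaning differs between the standard and mitochondrial tables:

```python
from Bio.Seq import Seq

seq = Seq('ATGATATGAAGA')  # ATG ATA TGA AGA

standard = seq.translate(table=1)
bacterial = seq.translate(table=11)  # same amino acid assignments as table 1
vert_mito = seq.translate(table=2)   # ATA -> M, TGA -> W, AGA -> stop
mold_mito = seq.translate(table=4)   # TGA -> W

print(standard, bacterial, vert_mito, mold_mito)  # MI*R MI*R MMW* MIWR
```

Table 11 differs from table 1 in its set of start codons, not in how codons are translated, so the difference only shows up with cds=True.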

CDS Translation (Complete Coding Sequence)

For validated coding sequences with proper start/stop:

cds = Seq('ATGTTTGGTTAA')  # Must start with start codon, end with stop
protein = cds.translate(cds=True)  # Validates and removes stop

The cds=True option:

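  • Validates the start codon (ATG or an alternative start in the chosen table) and always translates it as M
  • Requires the sequence length to be a multiple of three
  • Validates the stop codon at the end and rejects in-frame internal stop codons
  • Removes the stop codon from the result
  • Raises TranslationError if any check fails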
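
The start codon check depends on the codon table. As a sketch, GTG is a valid start in table 11 but not in table 1, so the same made-up sequence passes with one table and fails with the other:

```python
from Bio.Seq import Seq
from Bio.Data.CodonTable import TranslationError

cds = Seq('GTGTTTGGTTAA')  # GTG TTT GGT TAA

plain = cds.translate(table=11)             # GTG read as an ordinary codon (V)
strict = cds.translate(table=11, cds=True)  # GTG accepted as a start, becomes M

try:
    cds.translate(table=1, cds=True)  # GTG is not a start codon in table 1
    rejected = False
except TranslationError:
    rejected = True

print(plain, strict, rejected)  # VFG* MFG True
```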

Code Patterns

Basic Transcription and Translation Pipeline

dna = Seq('ATGTTTGGTCATTAA')
rna = dna.transcribe()
protein = rna.translate()
print(f'DNA: {dna}')
print(f'RNA: {rna}')
print(f'Protein: {protein}')

Translate All Six Reading Frames

def six_frame_translation(seq):
    frames = []
    for strand, s in [('+', seq), ('-', seq.reverse_complement())]:
        for frame in range(3):
            length = 3 * ((len(s) - frame) // 3)
            fragment = s[frame:frame + length]
            frames.append((strand, frame, fragment.translate()))
    return frames

seq = Seq('ATGCGATCGATCGATCGATCG')
for strand, frame, protein in six_frame_translation(seq):
    print(f'{strand}{frame}: {protein}')

Find All ORFs (Start to Stop)

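def find_orfs(seq, min_length=30):
    # min_length is in nucleotides. Reported positions are 0-based on the strand
    # that was scanned, so '-' positions refer to the reverse complement.
    orfs = []
    for strand, s in [('+', seq), ('-', seq.reverse_complement())]:
        for frame in range(3):
            # Trim the trailing partial codon to avoid Biopython's partial-codon warning
            length = 3 * ((len(s) - frame) // 3)
            trans = s[frame:frame + length].translate()
            aa_start = 0
            while True:
                start = trans.find('M', aa_start)
                if start == -1:
                    break
                stop = trans.find('*', start)
                if stop == -1:
                    stop = len(trans)  # No stop in this frame: ORF runs to the end
                orf = trans[start:stop]
                if len(orf) * 3 >= min_length:
                    orfs.append((strand, frame, start * 3 + frame, str(orf)))
                aa_start = start + 1  # Every M counts, so nested ORFs are reported too
    return orfs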

seq = Seq('ATGCGATCGATCGATCGATCGTAA')
for strand, frame, pos, orf in find_orfs(seq, min_length=3):
    print(f'{strand} frame {frame} pos {pos}: {orf}')

Translate with Quality Check

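from Bio.Data.CodonTable import TranslationError

def translate_cds_safe(seq):
    try:
        return seq.translate(cds=True)  # Strict: checks start, stop, and length
    except TranslationError:
        return seq.translate(to_stop=True)  # Fallback: translate up to the first stop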

Get Codon Table Info

from Bio.Data import CodonTable

table = CodonTable.unambiguous_dna_by_id[1]
print(f'Start codons: {table.start_codons}')
print(f'Stop codons: {table.stop_codons}')
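
The same table objects can be used to check exactly which codons change meaning between two tables. A minimal sketch, assuming a hypothetical helper named codon_differences (not part of Biopython) built on the forward_table dictionary:

```python
from itertools import product
from Bio.Data import CodonTable

def codon_differences(id_a, id_b):
    """Map each codon whose meaning differs to its (table_a, table_b) symbols."""
    a = CodonTable.unambiguous_dna_by_id[id_a]
    b = CodonTable.unambiguous_dna_by_id[id_b]
    diffs = {}
    for codon in (''.join(p) for p in product('ACGT', repeat=3)):
        # forward_table omits stop codons, so fall back to '*'
        aa_a = a.forward_table.get(codon, '*')
        aa_b = b.forward_table.get(codon, '*')
        if aa_a != aa_b:
            diffs[codon] = (aa_a, aa_b)
    return diffs

diffs = codon_differences(1, 2)  # Standard vs. Vertebrate Mitochondrial
for codon, (std, mito) in diffs.items():
    print(f'{codon}: {std} -> {mito}')
```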

Common Errors

Error | Cause | Solution
TranslationError: First codon is not a start codon | Used cds=True without a valid start codon | Remove cds=True or fix the sequence
TranslationError: Final codon is not a stop codon | Used cds=True without a stop codon at the end | Remove cds=True or add a stop codon
TranslationError: Sequence length is not a multiple of three | Used cds=True with a partial codon at the end (without cds=True, Biopython only issues a BiopythonWarning) | Trim the sequence to a multiple of 3
Unexpected amino acids | Wrong codon table | Specify the correct table for the organism
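
For the partial-codon case, one option is to trim the trailing bases before translating. A minimal sketch with a hypothetical helper, trim_to_codons, that assumes the reading frame starts at the first base:

```python
from Bio.Seq import Seq

def trim_to_codons(seq):
    """Drop trailing bases that do not complete a codon (hypothetical helper)."""
    return seq[:len(seq) - len(seq) % 3]

seq = Seq('ATGTTTGGTTA')  # 11 bases: three codons plus a partial 'TA'
trimmed = trim_to_codons(seq)
protein = trimmed.translate()

print(trimmed, protein)  # ATGTTTGGT MFG
```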

Decision Tree

Need to convert sequence?
├── DNA to RNA?
│   └── Use seq.transcribe()
├── RNA to DNA?
│   └── Use seq.back_transcribe()
├── DNA/RNA to protein?
│   ├── Complete CDS with start/stop?
│   │   └── Use translate(cds=True)
│   ├── Stop at first stop codon?
│   │   └── Use translate(to_stop=True)
│   ├── Non-standard organism?
│   │   └── Use translate(table=N)
│   └── Get all including stop symbol?
│       └── Use translate()
└── Find all ORFs?
    └── Translate all six frames, search for M...*

Related Skills

  • seq-objects - Create Seq objects for translation
  • reverse-complement - Translate both strands (six-frame translation)
  • codon-usage - Analyze codon bias in coding sequences
  • sequence-io/read-sequences - Parse GenBank files with CDS features
  • database-access/entrez-fetch - Fetch CDS sequences from NCBI for translation

Install

Requires askill CLI v1.0+

AI Quality Score

95/100, analyzed 2/11/2026

A comprehensive and highly actionable guide for sequence transcription and translation using Biopython, including advanced patterns like ORF finding and six-frame translation.

Metadata

License: unknown
Version: -
Updated: 2/5/2026
Publisher: majiayu000

Tags

ci-cd, database