Python code to download DNA sequences
To download DNA sequences, first install the Biopython library:
!pip install biopython
Code to download DNA sequences
from Bio import Entrez
from Bio import SeqIO
# NCBI asks for a contact email with every request; replace this with your own address
Entrez.email = "example@email.com"
# Define the NCBI database and accession number of the DNA sequence
database = "nucleotide"
accession_number = "NC_000913.3"
# Fetch the DNA sequence from the NCBI database
handle = Entrez.efetch(db=database, id=accession_number, rettype="gb", retmode="text")
record = SeqIO.read(handle, "genbank")
handle.close()  # release the connection once the record has been parsed
# Print the DNA sequence
print(record.seq)
This code uses Biopython to fetch a DNA sequence from NCBI. The Entrez module wraps NCBI's E-utilities web service, and the SeqIO module parses sequence file formats. Entrez.efetch requests the record in GenBank format (rettype="gb") as plain text and returns a file-like handle. SeqIO.read parses that handle and expects it to contain exactly one record; it raises a ValueError otherwise. The accession NC_000913.3 is the complete genome of Escherichia coli K-12 MG1655, so print(record.seq) outputs several million bases.
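The fetch-and-parse steps described above can be wrapped in a small function that always closes the handle, with a second function that reports a short summary instead of printing the whole genome. This is only a sketch: the names fetch_record and summarize and the demo FASTA text are made up for illustration and are not part of Biopython. The demonstration at the end parses an in-memory record, so it runs without network access.

```python
from io import StringIO

from Bio import Entrez, SeqIO


def fetch_record(accession, email, database="nucleotide"):
    """Fetch one GenBank record from NCBI (hypothetical helper, needs network)."""
    Entrez.email = email  # contact address that NCBI asks for
    handle = Entrez.efetch(db=database, id=accession, rettype="gb", retmode="text")
    try:
        # SeqIO.read expects exactly one record in the handle
        return SeqIO.read(handle, "genbank")
    finally:
        handle.close()


def summarize(record, preview=60):
    """Return a one-line summary rather than the full sequence."""
    start = str(record.seq[:preview])
    return f"{record.id}: {len(record.seq)} bp, starts with {start}"


# Offline demonstration with a made-up FASTA record (no request is sent).
demo = SeqIO.read(StringIO(">demo_seq hypothetical example\nATGCGTACGTTAGC\n"), "fasta")
print(summarize(demo, preview=6))  # prints: demo_seq: 14 bp, starts with ATGCGT
```

With network access, summarize(fetch_record("NC_000913.3", "you@example.org")) would give the same kind of summary for the record used above; the email address here is a placeholder.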
Unit Test
import unittest
from Bio import Entrez
from Bio import SeqIO
class TestDNA(unittest.TestCase):
    def setUp(self):
        # Runs before each test: set the contact email and the record to fetch
        Entrez.email = "example@email.com"
        self.database = "nucleotide"
        self.accession_number = "NC_000913.3"

    def test_dna_sequence(self):
        handle = Entrez.efetch(db=self.database, id=self.accession_number, rettype="gb", retmode="text")
        record = SeqIO.read(handle, "genbank")
        handle.close()
        self.assertIsNotNone(record.seq)
        self.assertGreater(len(record.seq), 0)


if __name__ == '__main__':
    unittest.main()
This code uses the unittest module to test the DNA sequence download code. The TestDNA class holds the tests, and its setUp method runs before each test to set the contact email, database, and accession number. The test_dna_sequence test fetches the record and asserts that the sequence is not None and that its length is greater than 0. Because the test calls NCBI directly, it needs network access, and it only confirms that some sequence came back, not that the content is correct.
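A test like this can also be run offline by replacing the network call with a canned response. The sketch below does this with unittest.mock. It assumes a hypothetical helper, download_sequence, that fetches FASTA rather than GenBank so the fake response stays short, and the accession FAKE_0001.1 and its sequence are invented. The suite is run through a TextTestRunner rather than unittest.main() so that it does not exit the interpreter, which is convenient in a notebook.

```python
import unittest
from io import StringIO
from unittest import mock

from Bio import Entrez, SeqIO


def download_sequence(accession):
    """Hypothetical helper: fetch one nucleotide record as FASTA and parse it."""
    handle = Entrez.efetch(db="nucleotide", id=accession, rettype="fasta", retmode="text")
    try:
        return SeqIO.read(handle, "fasta")
    finally:
        handle.close()


class TestDownloadOffline(unittest.TestCase):
    def test_parses_canned_response(self):
        # Made-up FASTA text standing in for the NCBI response
        canned = StringIO(">FAKE_0001.1 hypothetical record\nATGCATGC\n")
        # Swap Entrez.efetch for a mock so no request leaves the machine
        with mock.patch.object(Entrez, "efetch", return_value=canned) as fake_efetch:
            record = download_sequence("FAKE_0001.1")
        fake_efetch.assert_called_once()
        self.assertEqual(record.id, "FAKE_0001.1")
        self.assertEqual(str(record.seq), "ATGCATGC")


# Run the suite without calling sys.exit
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestDownloadOffline)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Unlike the live test, this version checks the exact sequence content, since the response is known in advance. It does not confirm that NCBI is reachable, so the two kinds of test complement each other.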