Saturday 25 February 2017

Download specific DNA sequences from hg19 using Python

I've been working on a little side project recently that needed lots of different human DNA sequences, fetched by their genomic position, which led me to discover the wonderful UCSC DAS server (from this informative Biostars thread).

Seeing as the rest of the project was written in Python, I knocked together a quick function to do just that. It's all nice and easy: just give it the chromosome number/letter*, and a numerical start and stop position, and the function returns the hg19 DNA sequence in that range.

I'm also trying to make a bit more use of GitHub (including knocking together a place for my publications), so I thought this was the perfect thing to make a gist from:

* Currently this function won't be able to grab anything from the unassigned chromosome contigs - just chromosomes 1-22, X, Y and mitochondrial (M) sequences.
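The gist itself isn't reproduced in this text, so what follows is only a minimal Python 3 sketch of what a function like this could look like, not the original code. It assumes the DAS endpoint takes the form `http://genome.ucsc.edu/cgi-bin/das/hg19/dna?segment=chrN:start,stop` and replies with XML holding the bases inside a `DNA` element. The names `build_das_url`, `parse_das_dna` and `get_hg19_sequence` are made up for illustration.

```python
import urllib.request
import xml.etree.ElementTree as ET

# Assumed URL pattern for the UCSC DAS DNA endpoint (hg19 assembly).
DAS_URL = "http://genome.ucsc.edu/cgi-bin/das/hg19/dna?segment=chr{chrom}:{start},{stop}"

# Only the assembled chromosomes are supported, matching the footnote above.
VALID_CHROMS = {str(n) for n in range(1, 23)} | {"X", "Y", "M"}


def build_das_url(chrom, start, stop):
    """Validate the inputs and build the request URL (hypothetical helper)."""
    chrom = str(chrom).upper()
    if chrom.startswith("CHR"):
        chrom = chrom[3:]  # accept "chr7" as well as "7"
    if chrom not in VALID_CHROMS:
        raise ValueError("unsupported chromosome: %r" % chrom)
    start, stop = int(start), int(stop)
    if start < 1 or stop < start:
        raise ValueError("expected 1 <= start <= stop")
    return DAS_URL.format(chrom=chrom, start=start, stop=stop)


def parse_das_dna(xml_data):
    """Pull the sequence out of a DAS reply (assumed to contain a <DNA> element)."""
    root = ET.fromstring(xml_data)
    dna = root.find(".//DNA")
    if dna is None or dna.text is None:
        raise ValueError("no DNA element found in the response")
    # The sequence is wrapped over several lines, so drop all whitespace.
    return "".join(dna.text.split())


def get_hg19_sequence(chrom, start, stop, timeout=30):
    """Fetch the hg19 sequence for chrom:start-stop from the DAS server."""
    url = build_das_url(chrom, start, stop)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return parse_das_dna(response.read())
```

Under those assumptions, a call such as `get_hg19_sequence("X", 100000, 100100)` would request that stretch of the X chromosome. Splitting the URL building and XML parsing away from the network call keeps both easy to check without touching the server.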

1 comment:

  1. Thanks for posting this! I had to modify it for python3, but it's exactly what I needed.
