Using the Python library
kPAL provides a lightweight Python library for creating, analysing, and manipulating k-mer profiles. It is implemented on top of NumPy.
This is a gentle introduction to the library. Consult the API reference for more detailed documentation.
k-mer profiles
The class Profile is the central object in kPAL. It encapsulates k-mer counts and provides operations on them.
Instead of using the Profile constructor directly, you should generally use one of the profile construction methods. One of those is Profile.from_fasta(). The following code creates a 6-mer profile by counting from a FASTA file:
>>> from kpal.klib import Profile
>>> with open('a.fasta') as handle:
...     p = Profile.from_fasta(handle, 6)
The profile object has several properties. For example, we can ask for the k-mer length (also known as k), the total k-mer count, or the median count per k-mer:
>>> p.length
6
>>> p.total
49995
>>> p.median
12.0
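To make these properties concrete, here is a minimal sketch in plain Python (not kPAL code) that computes the same three quantities for a toy 1-mer profile. It assumes that the total is the sum of all counts and that the median is taken over every possible k-mer, including those with a count of zero. The sequence and variable names are made up for illustration.

```python
from statistics import median

# Toy input: count each 1-mer (single nucleotide) in a short sequence.
sequence = 'GATTACA'
k = 1

# One count per possible 1-mer, in alphabetical order: A, C, G, T.
counts = [sequence.count(nucleotide) for nucleotide in 'ACGT']

# Assumed meaning of the properties shown above.
total = sum(counts)             # total k-mer count
median_count = median(counts)   # median count per k-mer

print(counts)        # [3, 1, 1, 2]
print(total)         # 7
print(median_count)  # 1.5
```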
Counts are stored as a NumPy ndarray of integers, one for each possible k-mer, in alphabetical order:
>>> len(p.counts)
4096
>>> p.counts
array([ 8, 11, 5, ..., 7, 12, 13])
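The array length follows from the alphabet size: there are 4^k possible k-mers over A, C, G and T, so 4096 for k = 6. The following sketch uses only the standard library and is not part of kPAL. It enumerates all 6-mers in alphabetical order to show which k-mer each array position would correspond to.

```python
from itertools import product

k = 6

# product() over a sorted alphabet yields tuples in lexicographic order,
# so joining them gives all k-mers in alphabetical order.
kmers = [''.join(letters) for letters in product('ACGT', repeat=k)]

print(len(kmers))   # 4096, which equals 4 ** 6
print(kmers[:3])    # ['AAAAAA', 'AAAAAC', 'AAAAAG']
print(kmers[-1])    # TTTTTT
```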
We can get the index in that array for a certain k-mer using the dna_to_binary() method:
>>> i = p.dna_to_binary('AATTAA')
>>> p.counts[i]
13
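Given the alphabetical ordering of the counts array, the index of a k-mer equals its value as a base-4 number with A=0, C=1, G=2 and T=3. The sketch below illustrates that mapping with two hypothetical helper functions, kmer_to_index and index_to_kmer. It is an assumption that dna_to_binary() computes its result the same way internally.

```python
NUCLEOTIDES = 'ACGT'  # alphabetical order, so A=0, C=1, G=2, T=3


def kmer_to_index(kmer):
    """Hypothetical helper: read the k-mer as a base-4 number."""
    index = 0
    for nucleotide in kmer:
        index = index * 4 + NUCLEOTIDES.index(nucleotide)
    return index


def index_to_kmer(index, k):
    """Hypothetical inverse: turn an index back into a k-mer of length k."""
    letters = []
    for _ in range(k):
        index, digit = divmod(index, 4)
        letters.append(NUCLEOTIDES[digit])
    # Digits were extracted least significant first, so reverse them.
    return ''.join(reversed(letters))


i = kmer_to_index('AATTAA')
print(i)                    # 240
print(index_to_kmer(i, 6))  # AATTAA
```

Because the first nucleotide is the most significant digit, sorting k-mers alphabetically and sorting them by this index give the same order.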
Storing k-mer profiles
Todo.
Differences between k-mer profiles
Todo.