Using the Python library
kPAL provides a light-weight Python library for creating, analysing, and manipulating k-mer profiles. It is implemented on top of NumPy.
This is a gentle introduction to the library. Consult the API reference for more detailed documentation.
Profile is the central object in kPAL. It encapsulates k-mer counts and provides operations on them.
Rather than calling the Profile constructor directly, you should generally use one of the profile construction methods. One of those is Profile.from_fasta(). The following code creates a 6-mer profile by counting k-mers in a FASTA file:
>>> from kpal.klib import Profile
>>> p = Profile.from_fasta(open('a.fasta'), 6)
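To make the idea of counting concrete, here is a minimal sketch of k-mer counting over a single DNA string with a sliding window. It is an illustration only, not how kPAL is implemented: count_kmers is a hypothetical helper, and skipping windows that contain characters other than A, C, G, and T is an assumption of this sketch rather than documented kPAL behaviour.

```python
from itertools import product

import numpy as np


def count_kmers(sequence, k):
    """Count all k-mers in a DNA string (hypothetical helper, not kPAL's API)."""
    # Map every possible k-mer to its position in alphabetical order.
    index = {''.join(kmer): i
             for i, kmer in enumerate(product('ACGT', repeat=k))}
    counts = np.zeros(4 ** k, dtype=np.int64)
    # Slide a window of width k over the sequence.
    for start in range(len(sequence) - k + 1):
        kmer = sequence[start:start + k]
        # Assumption of this sketch: windows with other characters
        # (such as N) are skipped.
        if kmer in index:
            counts[index[kmer]] += 1
    return counts


toy_counts = count_kmers('ACGTACGTNAC', 2)
print(len(toy_counts))   # one entry per possible 2-mer
print(toy_counts.sum())  # number of windows that were counted
```

The result has one entry for each of the 4**k possible k-mers, which is the same layout as the counts array of a profile.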
The profile object has several properties. For example, we can ask for the k-mer length (also known as k), the total k-mer count, or the median count per k-mer:
>>> p.length
6
>>> p.total
49995
>>> p.median
12.0
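As a rough illustration of what such summary values mean, the sketch below computes a total and a median from a small, made-up array of counts using plain NumPy. It assumes that the total is the sum over all counts and the median is taken over all entries; kPAL may compute these properties differently, so consult the API reference for the exact definitions.

```python
import numpy as np

# Made-up counts for illustration; a real 6-mer profile has 4096 entries.
counts = np.array([8, 11, 5, 7, 12, 13])

# Assumed definitions, not taken from kPAL's source.
total = int(counts.sum())
median = float(np.median(counts))

print(total, median)
```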
Counts are stored as a NumPy ndarray of integers, one for each possible k-mer, in alphabetical order:
>>> len(p.counts)
4096
>>> p.counts
array([ 8, 11, 5, ..., 7, 12, 13])
We can get the index in that array for a certain k-mer using the dna_to_binary() method:
>>> i = p.dna_to_binary('AATTAA')
>>> p.counts[i]
13
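Because the counts are kept in alphabetical order, the position of a k-mer can be understood as a base-4 number with A, C, G, and T as the digits 0 to 3. The sketch below shows that mapping and its inverse. The names kmer_index and index_kmer are hypothetical, and it is an assumption that the library's own encoding yields exactly these values.

```python
BASES = 'ACGT'


def kmer_index(kmer):
    """Position of a k-mer in alphabetical order (hypothetical helper)."""
    index = 0
    for base in kmer:
        # Each base contributes one base-4 digit, most significant first.
        index = index * 4 + BASES.index(base)
    return index


def index_kmer(index, k):
    """Inverse of kmer_index: recover the k-mer of length k at a position."""
    bases = []
    for _ in range(k):
        index, digit = divmod(index, 4)
        bases.append(BASES[digit])
    # Digits were produced least significant first, so reverse them.
    return ''.join(reversed(bases))


print(kmer_index('AATTAA'))
print(index_kmer(kmer_index('ACGTAC'), 6))
```

Under this scheme the all-A k-mer is at position 0 and the all-T k-mer is at the last position, 4**k - 1.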
Storing k-mer profiles
Differences between k-mer profiles