Published: November 5, 2025

Sup' #python #bioinformatics ! Didn't announce it so far, but now that it's working, I'm pleased to say you can now pip install BLAST+ (and more)! Technicalities below🧶 1/12

Image in tweet by Martin Larralde

Disclaimer: this is not an official @NCBI thing or anything, but it builds pretty heavily on the NCBI C++ Toolkit (https://www.ncbi.nlm.nih.gov/t...), a unified API to the NCBI algorithms and data model, so kudos to the developers who made that possible 🤝 2/12

Among other things, the C++ Toolkit has a BLAST API, which allows getting the same results as the BLAST+ binaries we love. However, the project is huge! ~2000 C++ files to compile to get the BLAST+ functionalities. 3/12

Since this is not super practical to link statically à la PyHMMER (where you get the whole HMMER library linked statically inside pyhmmer.plan7), I instead set up a more indirect system, where the C++ Toolkit libraries are distributed as dynamic libraries in their own package 4/12

This package (https://pypi.org/project/pyncb...) is built with @JFrog's Conan package manager, and distributed as Python-agnostic, platform-specific wheels, which means you should not have to rebuild it across installs (at least on macOS 13+ and Linux) 5/12
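
To make "Python-agnostic, platform-specific" concrete: a wheel filename ends with a Python tag, an ABI tag and a platform tag, and a wheel that only ships shared libraries can use something like `py3-none-<platform>`. Here is a minimal sketch that reads those tags from a filename. The filename and the helper names are made up for illustration; they are not taken from the actual package.

```python
from typing import NamedTuple


class WheelTags(NamedTuple):
    distribution: str
    version: str
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_filename(filename: str) -> WheelTags:
    """Split a wheel filename into its name, version and compatibility tags."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 6:
        # An optional build tag sits between the version and the Python tag.
        del parts[2]
    if len(parts) != 5:
        raise ValueError(f"unexpected wheel filename layout: {filename!r}")
    return WheelTags(*parts)


def is_python_agnostic(tags: WheelTags) -> bool:
    """True when the wheel is tied to no specific interpreter ABI."""
    return tags.python_tag.startswith("py") and tags.abi_tag == "none"


def is_platform_specific(tags: WheelTags) -> bool:
    """True when the wheel targets a particular OS/architecture."""
    return tags.platform_tag != "any"


# Hypothetical filename, for illustration only.
tags = parse_wheel_filename("example_runtime-1.0.0-py3-none-manylinux_2_17_x86_64.whl")
print(tags.python_tag, tags.abi_tag, tags.platform_tag)
print(is_python_agnostic(tags), is_platform_specific(tags))
```

One such wheel per platform can serve every supported Python 3 version, whereas compiled extension modules usually need one wheel per interpreter version as well.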


Then the main package (https://pypi.org/project/pyncb...) contains the Cython bindings linking against the runtime components (with some dynamic RUNPATH stuff that was a pain to get working), meaning I can still update them often without expecting you to perform a 1h build downstream 6/12
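
For context on the RUNPATH part: on Linux, an extension module can record a loader-relative search path such as `$ORIGIN/../other_package/lib` so the dynamic loader finds shared libraries installed by a sibling package (macOS has a similar `@loader_path` mechanism). Below is a rough sketch of how such an entry can be computed. The package names and paths are hypothetical, and this is not necessarily how the actual build does it.

```python
import posixpath


def origin_relative_runpath(extension_path: str, library_dir: str, origin: str = "$ORIGIN") -> str:
    """Return a loader-relative search path from an extension module to a library directory.

    `origin` is the token the loader expands to the directory containing the
    extension itself: "$ORIGIN" for ELF (Linux), "@loader_path" for Mach-O (macOS).
    """
    extension_dir = posixpath.dirname(extension_path)
    relative = posixpath.relpath(library_dir, start=extension_dir)
    return posixpath.join(origin, relative)


# Hypothetical install layout, for illustration only.
site_packages = "/venv/lib/python3.12/site-packages"
extension = posixpath.join(site_packages, "mybindings", "_core.so")
libraries = posixpath.join(site_packages, "myruntime", "lib")

print(origin_relative_runpath(extension, libraries))
print(origin_relative_runpath(extension, libraries, origin="@loader_path"))
```

Because the entry is relative to the extension itself, it keeps working wherever the two packages end up, as long as they sit side by side in the same site-packages directory.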

Once you have pyncbitk set up, you can run your BLAST from the Python interpreter, and you even get #mypy type annotations for the parameters. 7/12


The docs and interface are WIP, but you can already find some working examples that show how to prepare data for BLASTn and how to recover results (https://pyncbitk.readthedocs.i...) 8/12

As a proof of concept, I ported @torstenseemann's ABRicate into a pure-Python package (https://pypi.org/project/pyabr...) and it's working like a charm! 9/12


Overall it's still rough around the edges, but I think that's a major personal milestone in pushing forward the technical soundness of foundational bioinformatics tools like BLAST+! 10/12

The C++ Toolkit is actually really feature-rich and also contains some APIs to handle taxonomy, plus other algorithms like Gnomon or Dustmasker (though you can already use @apcamargo_'s excellent pydustmasker for that), so I'll keep expanding functionalities in the Python part. 11/12

Code on GitHub of course (https://github.com/althonos/py...) if you wanna have a look. Happy coding 🤖 ! 12/12
