PyIR: a scalable wrapper for processing billions of immunoglobulin and T cell receptor sequences using IgBLAST


BMC Bioinformatics. 2020 Jul;21(1):314. doi: 10.1186/s12859-020-03649-5. Epub 2020-07-16.

PyIR: a scalable wrapper for processing billions of immunoglobulin and T cell receptor sequences using IgBLAST.

Cinque Soto
Jessica A Finn
Jordan R Willis
Samuel B Day
Robert S Sinkovits
Taylor Jones
Samuel Schmitz
Jens Meiler
Andre Branchizio
James E Crowe

PMID: 32677886 DOI: 10.1186/s12859-020-03649-5.

Abstract

BACKGROUND: Recent advances in DNA sequencing technologies have enabled significant leaps in capacity to generate large volumes of DNA sequence data, which has spurred a rapid growth in the use of bioinformatics as a means of interrogating antibody variable gene repertoires. Common tools used for annotation of antibody sequences are often limited in functionality, modularity and usability.

RESULTS: We have developed PyIR, a Python wrapper and library for IgBLAST, which offers a minimal setup CLI and API, FASTQ support, file chunking for large sequence files, JSON and Python dictionary output, and built-in sequence filtering.
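As a rough illustration of the file chunking, built-in filtering, and JSON/dictionary output mentioned above, the following minimal Python sketch splits FASTA records into fixed-size chunks and emits one dictionary per retained sequence. It does not use PyIR's actual API: `read_fasta`, `chunked`, `annotate_chunk`, and `process` are hypothetical names, and `annotate_chunk` is only a stand-in for the step where a wrapper would hand a chunk to an IgBLAST process.

```python
import json
from itertools import islice
from typing import Dict, Iterable, Iterator, List, Tuple

Record = Tuple[str, str]  # (sequence identifier, nucleotide sequence)


def read_fasta(lines: Iterable[str]) -> Iterator[Record]:
    """Yield (identifier, sequence) pairs from FASTA-formatted lines."""
    header, parts = None, []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(parts)
            header, parts = line[1:], []
        else:
            parts.append(line)
    if header is not None:
        yield header, "".join(parts)


def chunked(records: Iterable[Record], size: int) -> Iterator[List[Record]]:
    """Group records into lists of at most `size` items."""
    iterator = iter(records)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def annotate_chunk(chunk: List[Record], min_length: int = 0) -> List[Dict]:
    """Hypothetical stand-in for annotating one chunk with IgBLAST.

    Here it only applies a length filter and returns one dictionary
    per retained sequence; no alignment is performed.
    """
    return [
        {"sequence_id": seq_id, "sequence": seq, "length": len(seq)}
        for seq_id, seq in chunk
        if len(seq) >= min_length
    ]


def process(lines: Iterable[str], chunk_size: int, min_length: int = 0) -> List[Dict]:
    """Chunk the input, annotate each chunk, and collect the results."""
    results: List[Dict] = []
    for chunk in chunked(read_fasta(lines), chunk_size):
        results.extend(annotate_chunk(chunk, min_length))
    return results


if __name__ == "__main__":
    demo = [">seq1", "ACGTACGT", ">seq2", "ACG", ">seq3", "ACGTAC", "GTAC"]
    for record in process(demo, chunk_size=2, min_length=5):
        print(json.dumps(record))  # one JSON object per line
```

Because each chunk is an independent list of records, the chunks in a sketch like this could be dispatched to separate worker processes, which is the general pattern a chunking wrapper relies on to scale across cores.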

CONCLUSIONS: PyIR offers improved processing speed over multithreaded IgBLAST (version 1.14) when spawning more than 16 processes on a single computer system. Its customizable filtering and data encapsulation allow it to be adapted to a wide range of computing environments. The API allows for IgBLAST to be used in customized bioinformatics workflows.