Accurate profiling of microbial communities from massively parallel sequencing using convex optimization

Or Zuk, Amnon Amir, Amit Zeisel, Ohad Shamir, Noam Shental

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-reviewed

Abstract

We describe the Microbial Community Reconstruction (MCR) Problem, which is fundamental for microbiome analysis. In this problem, the goal is to reconstruct the identity and frequency of the species comprising a microbial community, using short sequence reads from Massively Parallel Sequencing (MPS) data obtained for specified genomic regions. We formulate the problem mathematically as a convex optimization problem and provide sufficient conditions for identifiability, namely the ability to reconstruct species identity and frequency correctly as the data size (number of reads) grows to infinity. We discuss different metrics for assessing the quality of the reconstructed solution, including a novel phylogenetically aware metric based on the Mahalanobis distance, and give upper bounds on the reconstruction error for a finite number of reads under different metrics. We propose a scalable divide-and-conquer algorithm for the problem using convex optimization, which enables us to handle large problems (with ∼10^6 species). We show using numerical simulations that in realistic scenarios, where the microbial communities are sparse, our algorithm gives solutions with high accuracy, both in terms of species frequency and in terms of phylogenetic resolution.

Original language: English
Title of host publication: String Processing and Information Retrieval - 20th International Symposium, SPIRE 2013, Proceedings
Publisher: Springer Verlag
Pages: 279-297
Number of pages: 19
ISBN (Print): 9783319024318
DOIs:
Publication status: Published - 2013
Event: 20th International Symposium on String Processing and Information Retrieval, SPIRE 2013 - Jerusalem, Israel
Duration: 7 October 2013 to 9 October 2013

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 8214 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 20th International Symposium on String Processing and Information Retrieval, SPIRE 2013
Country/Territory: Israel
City: Jerusalem
Period: 7/10/13 to 9/10/13
