Biological SciencesSubstack

Paired Ends

Bioinformatics, computational biology, and data science updates from the field. Occasional posts on programming.
Home PageRSS FeedMastodon
language
Published
Author Stephen Turner

This week’s recap highlights the Evo model for sequence modeling and design, biomedical discovery with AI agents, improving bioinformatics software quality through teamwork, a new tool from Brent Pedersen and Aaron Quinlan (vcfexpress) for filtering and formatting VCFs with Lua expressions, a new paper about the NHGRI-EBI GWAS Catalog, and a review paper on designing and engineering synthetic genomes.

Published
Author Stephen Turner

This week’s recap highlights a new way to turn Nextflow pipelines into web apps, DRAGEN for fast and accurate variant calling, machine-guided design of cell-type-targeting cis-regulatory elements, a Nextflow pipeline for identifying and classifying protein kinases, a new language model for single cell perturbations that integrates knowledge from literature, GeneCards, etc., and a new method for scalable protein design in a relaxed sequence

Published
Author Stephen Turner

This week’s recap highlights the WorkflowHub registry for computational workflows, building a virtual cell with AI, a review on bioinformatics methods for prioritizing causal genetic variants in candidate regions, a benchmarking study showing deep learning methods are best for variant calling in bacterial nanopore sequencing, and a new ML model from researchers at Genentech for predicting cell-type- and condition-specific gene expression across

Published
Author Stephen Turner

This week’s recap highlights pangenome graph construction with nf-core/pangenome, building pangenome graphs with PGGB, benchmarking algorithms for single-cell multi-omics prediction and integration, RNA foundation models, and a Nextflow pipeline for characterizing B cell receptor repertoires from non-targeted bulk RNA-seq data.

Published
Author Stephen Turner

This week’s recap highlights an AI agent for automated multi-omic analysis (AutoBA), rapid species-level metagenome profiling and containment (sylph), a review on genome-wide association analysis beyond SNPs, private information leakage from scRNA-seq count matrices, and a method to “unlearn” viral knowledge in protein language models as a means to develop safe PLM-based variant effect analysis (PROEDIT). Others that caught my attention include

Published
Author Stephen Turner

This week’s recap highlights a new Nextflow workflow for calculating polygenic scores with adjustments for genetic ancestry, a paper demonstrating that whole exome plus imputation on more samples is more powerful than whole genome sequencing for finding more trait associated variants, a new deep-learning-based splice site predictor that improves spliced alignments, a new method for accurate community profiling of large metagenomic datasets, and

Published
Author Stephen Turner

This week’s recap highlights a new method for gene-level alignment of single-cell trajectories, an R package for integrating gene and protein identifiers across biological sequence databases, characterization of SVs across humans and apes, universal prediction of cellular phenotypes, a method to quantify cell state heritability versus plasticity and infer cell state transition with single cell data, and a new AI-driven, natural language-oriented