InVitae has been granted a patent for a method of identifying mutations in nucleic acid. The method involves sequencing the nucleic acid to generate sequence reads, assembling the reads into a contig, aligning the contig to a reference genome, aligning individual reads to the contig, and identifying mutations based on the alignments. The patent also includes a method for determining genotype information for the sample organism based on the identified mutations. GlobalData’s report on InVitae gives a 360-degree view of the company including its patenting strategy.
According to GlobalData’s company profile on InVitae, AI-assisted genome sequencing was a key innovation area identified from patents. InVitae's grant share as of June 2023 was 1%. Grant share is the ratio of the number of granted patents to the total number of patents.
The patent is granted for a method of identifying mutations in nucleic acid sequences
A recently granted patent (Publication Number: US11667965B2) describes a method for genotyping nucleic acids from a sample organism. The method involves receiving a plurality of sequence reads from the nucleic acid and assembling them into an assembly that represents a contiguous region of the nucleic acid. The assembly is then aligned to a reference genome to generate an assembly-reference alignment. Variants in the assembly relative to the reference genome are identified, and each of the sequence reads is aligned to the assembly to generate a read-assembly alignment. Based on the assembly-reference alignment and the read-assembly alignment, a variant state is assigned to each of the variants relative to the reference genome, allowing for the determination of genotype information for the sample organism.
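As a rough illustration of the final step, the sketch below turns per-variant read support into a genotype call. The patent summary does not say how the variant state is converted into a genotype, so the function name `call_genotype`, the allele-fraction rule, and the 0.2 and 0.8 thresholds are all assumptions made for this example and not the patented method.

```python
def call_genotype(alt_reads, ref_reads, het_min=0.2, hom_min=0.8):
    """Return a VCF-style diploid genotype from read support counts.

    alt_reads: reads whose read-assembly alignment supports the variant.
    ref_reads: reads that support the reference allele at the same site.
    het_min and hom_min are illustrative thresholds (assumptions).
    """
    total = alt_reads + ref_reads
    if total == 0:
        return "./."  # no coverage, genotype unknown
    alt_fraction = alt_reads / total
    if alt_fraction >= hom_min:
        return "1/1"  # homozygous for the variant
    if alt_fraction >= het_min:
        return "0/1"  # heterozygous
    return "0/0"      # homozygous reference


print(call_genotype(12, 11))
```

A production caller would more likely compare genotype likelihoods than apply fixed cut-offs; the fixed thresholds here only keep the example short.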
The method described in the patent involves mapping the sequence reads to the reference genome with the Burrows-Wheeler Aligner (BWA) and then assembling the mapped reads into the assembly. The variant state can be a mutation, insertion, or deletion. Aligning each sequence read to the assembly generates a likelihood score, which is used to assign a variant state to each sequence read. The assembly-reference alignment and the locations of the variants are described in a Concise Idiosyncratic Gapped Alignment Report (CIGAR) string. The method also includes genotyping the sample organism based on the variant state of the variants, and the variant states for each variant can be output in a Variant Call Format (VCF) file.
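To show how a CIGAR string can carry variant locations, here is a minimal sketch that walks an assembly-reference CIGAR and reports the insertions and deletions it encodes. The function name `variants_from_cigar` and the 0-based coordinate convention are assumptions for this example; the operation codes are the standard SAM ones. Substitutions are not recoverable this way when they are folded into `M` operations.

```python
import re

# One CIGAR operation: a run length followed by a SAM operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def variants_from_cigar(cigar, ref_start):
    """Return (kind, ref_pos, length) for each insertion or deletion.

    ref_start is the 0-based reference position where the alignment begins.
    """
    variants = []
    ref_pos = ref_start
    for length, op in CIGAR_OP.findall(cigar):
        length = int(length)
        if op == "I":
            # Extra bases in the assembly; the reference does not advance.
            variants.append(("insertion", ref_pos, length))
        elif op == "D":
            # Bases present in the reference but missing from the assembly.
            variants.append(("deletion", ref_pos, length))
            ref_pos += length
        elif op in "MN=X":
            ref_pos += length
        # S, H and P consume no reference bases.
    return variants


print(variants_from_cigar("10M2I5M3D20M", 100))
```

Each reported tuple could then be written as one record of a VCF file once the inserted or deleted bases are looked up in the sequences.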
Another method described in the patent involves obtaining a sample comprising nucleic acid, sequencing the nucleic acid to generate a plurality of sequence reads, and detecting a variant in the sequence reads relative to a reference genome. This method includes assembling a contig from the sequence reads, aligning the contig to the reference genome to obtain a contig-to-reference alignment, and aligning each sequence read to the aligned contig to obtain a read-to-contig alignment. The contig-to-reference alignment is then aligned to the read-to-contig alignments to obtain read-to-reference alignments, which describe the position of the sequence reads relative to the reference genome and the differences between the sequence reads and the reference genome. The detected variant is then output based on these differences.
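The idea of positioning reads on the reference by way of the contig can be sketched as a coordinate projection: a read's position on the contig is pushed through the contig-to-reference CIGAR. The helper name `contig_to_ref`, the 0-based coordinates, and the choice to return `None` for bases with no reference counterpart are assumptions for this example, not details taken from the patent.

```python
import re

# One CIGAR operation: a run length followed by a SAM operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def contig_to_ref(contig_pos, contig_cigar, contig_ref_start):
    """Map a 0-based contig position to a 0-based reference position.

    Returns None when the contig base sits inside an insertion or soft
    clip, or lies beyond the aligned part of the contig.
    """
    ref_pos = contig_ref_start
    seen = 0  # contig bases consumed so far
    for length, op in CIGAR_OP.findall(contig_cigar):
        length = int(length)
        if op in "M=X":
            if contig_pos < seen + length:
                return ref_pos + (contig_pos - seen)
            seen += length
            ref_pos += length
        elif op in "IS":
            if contig_pos < seen + length:
                return None  # base exists only in the contig
            seen += length
        elif op in "DN":
            ref_pos += length  # reference-only bases
        # H and P consume neither sequence.
    return None


# A read that starts at contig position 12 lands at this reference position.
print(contig_to_ref(12, "10M2I5M3D20M", 100))
```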
The methods described in the patent can be implemented using a computer system with a processor and memory. The reference genome can be stored in the main memory, and the alignment steps can be performed using the Burrows-Wheeler Aligner (BWA)-long algorithm for aligning the contig to the reference genome and the BWA-short algorithm for aligning each sequence read to the aligned contig. The alignments are represented in BAM files, and the differences between the aligned contig and the reference genome are described in a CIGAR string. The read-to-contig CIGAR string is adjusted to include the read-to-reference alignments, resulting in a read-to-reference CIGAR string that identifies the variant relative to the reference genome.
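One way to picture the CIGAR adjustment is to compose the two alignments base by base. The sketch below is an assumption about how such a composition could work, not the patent's procedure: it handles only the `M`, `I`, `D` and `S` operations, uses 0-based coordinates, ignores unusual cases such as a read alignment that begins with a deletion, and the names `expand`, `compress` and `compose` are hypothetical.

```python
import re
from itertools import groupby

# Only M, I, D and S are handled in this sketch.
CIGAR_OP = re.compile(r"(\d+)([MIDS])")


def expand(cigar):
    """'3M1I' -> 'MMMI' (one character per aligned column)."""
    return "".join(op * int(n) for n, op in CIGAR_OP.findall(cigar))


def compress(ops):
    """'MMMI' -> '3M1I'."""
    return "".join(f"{len(list(group))}{op}" for op, group in groupby(ops))


def compose(read_cigar, read_contig_start, contig_cigar, contig_ref_start):
    """Return (ref_start, read_to_reference_cigar) for one read."""
    cols = expand(contig_cigar)
    col, contig_pos, ref_pos = 0, 0, contig_ref_start
    # Move to the column that holds the read's first contig base,
    # skipping any reference-only (D) columns in front of it.
    while contig_pos < read_contig_start or cols[col] == "D":
        if cols[col] != "I":
            ref_pos += 1
        if cols[col] != "D":
            contig_pos += 1
        col += 1
    out = []
    for op in expand(read_cigar):
        if op in "SI":
            out.append(op)  # read-only bases pass through unchanged
            continue
        # M or D consumes one contig base; first emit any reference bases
        # that the contig itself deletes at this point.
        while cols[col] == "D":
            out.append("D")
            col += 1
        contig_op = cols[col]
        col += 1
        if op == "M":
            # A base matching a contig insertion is an insertion vs. the reference.
            out.append("M" if contig_op == "M" else "I")
        elif contig_op == "M":
            out.append("D")
        # A read deletion over a contig insertion cancels out: emit nothing.
    return ref_pos, compress(out)


print(compose("12M", 8, "10M2I5M3D20M", 100))
```

In this example, a read that matches the contig perfectly still reports the contig's insertion and deletion in its read-to-reference CIGAR, which is the point of routing reads through the contig.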