InVitae has been granted a patent for a program, stored on a machine-readable medium, that generates vector representations of documents using a visual model and clusters document pages based on those representations. The program also selects a sample set of documents from the clusters for annotation, and the annotations are used to train an annotation AI model. GlobalData’s report on InVitae gives a 360-degree view of the company, including its patenting strategy.


According to GlobalData’s company profile on InVitae, AI-assisted genome sequencing was a key innovation area identified from patents. InVitae’s grant share as of January 2024 was 38%. Grant share is the ratio of the number of grants to the total number of patents.

Document clustering and annotation AI model training

Source: United States Patent and Trademark Office (USPTO). Credit: InVitae Corp

A recently granted patent (Publication Number: US11860903B1) discloses a non-transitory machine-readable medium storing a program executable by a processing unit of a device. The program receives multiple documents and, for each one, uses a visual model to generate a vector representation by detecting pixel values and propagating them through a neural network. It then clusters the document pages based on these vector representations, selects a sample set of documents from the clusters, receives annotations for those documents, and trains an annotation AI model using the annotations. The program also includes instructions for converting document pages into images, generating vector representations from those images, and clustering the images based on these representations.
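The sequence of steps described above (embed each page, cluster the embeddings, pick a sample from each cluster) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: the "visual model" is a hypothetical single dense layer with fixed random weights, the clustering uses plain k-means because the article does not name a clustering algorithm, and the names `make_weights`, `embed_page`, `kmeans`, and `select_samples` are invented for this example. The annotation and model-training steps are omitted.

```python
import math
import random

def make_weights(n_inputs, n_outputs, seed=0):
    """Hypothetical stand-in for a trained visual model: one dense layer."""
    rng = random.Random(seed)
    # Scale by n_inputs so every weighted sum stays within [-1, 1].
    return [[rng.uniform(-1, 1) / n_inputs for _ in range(n_inputs)]
            for _ in range(n_outputs)]

def embed_page(pixels, weights):
    """Read the pixel values of a page image and propagate them through the layer."""
    flat = [p / 255.0 for row in pixels for p in row]
    return [math.tanh(sum(w * x for w, x in zip(unit, flat))) for unit in weights]

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(vectors, k, iterations=10):
    """Plain k-means with deterministic farthest-point initialization (an assumption)."""
    centroids = [vectors[0]]
    while len(centroids) < k:
        centroids.append(max(
            vectors,
            key=lambda v: min(squared_distance(v, c) for c in centroids)))
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assign every vector to its nearest centroid.
        labels = [min(range(k), key=lambda j: squared_distance(v, centroids[j]))
                  for v in vectors]
        # Move each centroid to the mean of its members.
        for j in range(k):
            members = [v for v, label in zip(vectors, labels) if label == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

def select_samples(vectors, labels, centroids):
    """Pick, for each cluster, the index of the page closest to its centroid."""
    samples = []
    for j, centroid in enumerate(centroids):
        members = [i for i, label in enumerate(labels) if label == j]
        if members:
            samples.append(min(
                members, key=lambda i: squared_distance(vectors[i], centroid)))
    return samples

def flat_page(level, size=4):
    """Toy 4x4 grayscale 'page image' with a uniform pixel level."""
    return [[level] * size for _ in range(size)]

# Two dark pages and two bright pages as toy input.
pages = [flat_page(0), flat_page(20), flat_page(250), flat_page(240)]
weights = make_weights(n_inputs=16, n_outputs=8)
vectors = [embed_page(p, weights) for p in pages]
labels, centroids = kmeans(vectors, k=2)
samples = select_samples(vectors, labels, centroids)
print("cluster labels:", labels)
print("pages to send for annotation:", samples)
```

In this toy setup the selected indices would be the pages handed to annotators. Choosing the page nearest each centroid is only one possible sampling rule; the article does not say how the sample set is drawn from the clusters.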

The patent also describes a method and a system based on the same principles: documents are processed with a visual model to create vector representations, pages are clustered into sets based on these representations, sample documents are selected from the clusters, annotations are received, and an annotation AI model is trained. The method and system also involve converting document pages into images, clustering the images, and determining document similarity based on cosine similarity thresholds. The system includes processing units and a machine-readable medium storing instructions for executing the described operations. Overall, the patent outlines an approach to document processing, clustering, and annotation that relies on visual models and neural networks.
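The cosine similarity check mentioned above can be illustrated with a short sketch. The function names and the threshold value of 0.9 are assumptions made for this example; the article does not state what thresholds the patent uses.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def are_similar(a, b, threshold=0.9):
    """Treat two document vectors as similar when their cosine meets the threshold.

    The default threshold of 0.9 is a hypothetical value chosen for illustration.
    """
    return cosine_similarity(a, b) >= threshold

# Vectors pointing in nearly the same direction pass the check.
print(are_similar([1.0, 0.0], [1.0, 0.1]))
# Orthogonal vectors do not.
print(are_similar([1.0, 0.0], [0.0, 1.0]))
```

Because cosine similarity compares direction rather than magnitude, two vectors that differ only by a scale factor count as maximally similar under this measure.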




GlobalData, the leading provider of industry intelligence, provided the underlying data, research, and analysis used to produce this article.

GlobalData Patent Analytics tracks bibliographic data, legal events data, point-in-time patent ownership, and backward and forward citations from patent offices worldwide. Textual analysis and official patent classifications are used to group patents into key thematic areas and link them to specific companies across the world’s largest industries.