COSMIC curation

Capturing and organising the most critical information on somatic mutations in cancer

The world’s most comprehensive resource of somatic mutation in cancer

Manual curation is at the heart of COSMIC. It is the process by which we extract information from publications and other data sources, review it, record the variants and add further insights. Our extensive expertise allows us to extract the greatest level of detail and capture it in the Knowledgebase. Our Scientific Curators are accomplished scientists who undergo a further two years of specialised training to become fully qualified COSMIC curators, equipped to curate data across thousands of cancer types and related genes.

Expert curation powered by rigorous standards

Our process begins with a deep scan of available publications to identify where additional data can be added for a gene or cancer type, further enriching the somatic picture. We typically target genes that have evidence of causality in cancer, curating up to 45 data points for samples with a full complement of information. Patient information, histology, pathology and more are recorded, so that individual studies can be grouped together using these metadata.
Leveraging over two decades of expertise, we have developed software tools able to ingest large data sets, including whole exome and whole genome sequencing as well as targeted panels. Our team cannot use publications whose data is incorrectly formatted, and will often contact authors to advise on the changes needed. You can read more about how to help biocurators to maximise the reach of data here.
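
As a rough illustration of how recorded metadata lets individual studies be grouped together, here is a minimal Python sketch. It is not COSMIC's own software: the record fields (`study`, `gene`, `primary_site`, `histology`), the sample values and the `group_by_metadata` helper are all hypothetical, chosen only to show the idea.

```python
from collections import defaultdict

# Hypothetical curated sample records; field names and values are
# illustrative assumptions, not the actual COSMIC schema or data.
samples = [
    {"study": "study_A", "gene": "GENE1", "primary_site": "lung", "histology": "carcinoma"},
    {"study": "study_B", "gene": "GENE1", "primary_site": "lung", "histology": "carcinoma"},
    {"study": "study_C", "gene": "GENE2", "primary_site": "skin", "histology": "melanoma"},
]


def group_by_metadata(records, keys):
    """Group records that share the same values for the given metadata keys."""
    groups = defaultdict(list)
    for record in records:
        # Build a hashable key from the selected metadata fields.
        group_key = tuple(record.get(key) for key in keys)
        groups[group_key].append(record)
    return dict(groups)


# Samples from different studies fall into the same group when their
# site and histology metadata match.
groups = group_by_metadata(samples, ("primary_site", "histology"))
lung_studies = {record["study"] for record in groups[("lung", "carcinoma")]}
```

In this sketch, samples from `study_A` and `study_B` end up in the same group because they share site and histology values, which is the kind of cross-study grouping that consistent metadata makes possible.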

Once data is entered into our system we apply additional analyses. For example, we use the Variant Effect Predictor (VEP) tool to show the impact of SNVs, insertions, deletions, CNVs and structural variants. These analyses are integral to each release process, both to update the web platform and to compile the download files twice a year. Once reviewed and approved for inclusion in the Knowledgebase, the data becomes more findable as an output of COSMIC’s expert curation process. The accessible nature of the Knowledgebase supports both interoperability and reproducibility downstream.

Additionally, our team includes Scientific Curators dedicated specifically to our Actionability resource. This specialised curation tracks the availability of drugs targeting key mutations, and the progress of clinical studies towards making new drugs available.

Our meticulous curation provides you with FAIR-compliant data organised for precision and efficiency. Whether you are driving breakthroughs in oncology or refining clinical insights, our Knowledgebase offers the confidence and the up-to-date data you need for reproducible results.

Explore how to download and use COSMIC data