Finding more differentially altered genes by combining alteration types

With the release of cBioPortal version 3.6.8, it has become possible to more specifically select types of mutations and to combine mutations, fusions and copy number alterations in one group comparison. This can help in finding differentially altered genes for which the separate analyses do not yield a statistically significant result, but the combined analysis does, or cases where a very specific mutation type is responsible for a specific effect. In this blog post, we will illustrate how this works by guiding you through one example of each. This work was sponsored by AstraZeneca.

What is the problem?

A gene can be altered in a variety of ways. There can be mutations, fusions and/or copy number alterations. These alterations can have different effects depending on the gene that is affected. Sometimes multiple types of alteration in the same gene can have the same downstream effect.

For instance, a tumor suppressor gene like CDKN2A can stop functioning because of a point mutation in the gene, a fusion with a different gene or a deep deletion. There are many ways to break things! Until this release, cBioPortal did not offer a way to consider these loss of function alterations together in a group comparison.

The same holds for gain of function alterations. Take BRAF as an example: it is one of the best-known oncogenes and is often mutated at amino acid position 600. Copy number amplification and several different fusions of BRAF are also considered oncogenic. These alterations can be considered together as gain of function alterations.

Let’s say that for loss of function alterations of gene X, we think that patient survival is decreased. Since we already know about gene X and we would like to know the impact it has on survival, we can write the following OQL:

gene X : MUT HOMDEL

to only select samples with mutations and homozygous deletions, which are likely to be loss of function alterations. Now we can go to the group comparison to see if there is a significant impact on survival.
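To make the effect of such a query concrete, here is a minimal Python sketch of the selection it describes. This is not cBioPortal code: the sample IDs, the alteration calls and the helper `split_by_alteration` are all made up for illustration.

```python
# Hypothetical per-sample alteration calls for "gene X"; not real data.
samples = {
    "S1": {"MUT"},
    "S2": {"HOMDEL"},
    "S3": {"AMP"},
    "S4": set(),
    "S5": {"MUT", "AMP"},
}


def split_by_alteration(calls, selected):
    """Split sample IDs into (altered, unaltered) for the selected alteration types."""
    # A sample counts as altered if it carries at least one selected type.
    altered = sorted(s for s, alts in calls.items() if alts & selected)
    unaltered = sorted(s for s in calls if s not in altered)
    return altered, unaltered


# Mirrors the intent of "MUT HOMDEL": amplification alone does not qualify.
altered, unaltered = split_by_alteration(samples, {"MUT", "HOMDEL"})
print(altered)    # ['S1', 'S2', 'S5']
print(unaltered)  # ['S3', 'S4']
```

Note that S3, which is only amplified, ends up in the unaltered group, which is exactly what we want when studying loss of function.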

If we had not restricted the query to mutations and homozygous deletions, additional samples would have ended up in the altered group, e.g. copy number amplified samples, for which we do not expect a negative impact on survival. These samples would dilute the signal, which would likely have led to a higher p-value and perhaps to a result that is no longer significant.

But what if you do not know about gene X in advance? Before the new feature was released, it was only possible to look at differentially mutated, amplified or deleted genes, but not at combinations of these. This way you could miss a gene whose loss or gain of function has a downstream effect but for which the separate alteration types do not produce a significant result.

Let’s look at some examples!

Example: combining mutations and CNA

We will take a study with a reasonably large number of samples, Brain Lower Grade Glioma (TCGA, PanCancer Atlas), and create a group comparison on one of the clinical attributes: ‘Diagnosis Age’. This is a numeric attribute, so we will create the groups based on ranges. An easy way to do this is to use quartiles. Simply click the ‘hamburger button’ on the Diagnosis Age chart and select Compare Groups -> Quartiles.
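For readers who want to see what splitting on quartiles amounts to, here is a small sketch using Python's standard library. The ages are made up, the group labels A to D follow the labels used in this post, and the interpolation method is an assumption; the portal may compute its cut points differently.

```python
from collections import Counter
from statistics import quantiles

# Made-up diagnosis ages; not taken from the TCGA study.
ages = [14, 22, 29, 32, 35, 41, 47, 52, 53, 60, 71, 87]

# Three cut points split the sorted values into four groups.
q1, q2, q3 = quantiles(ages, n=4, method="inclusive")


def quartile_group(age):
    """Assign an age to group A (youngest) through D (oldest)."""
    if age <= q1:
        return "A"
    if age <= q2:
        return "B"
    if age <= q3:
        return "C"
    return "D"


group_sizes = Counter(quartile_group(a) for a in ages)
print(q1, q2, q3)
print(dict(group_sizes))
```

Comparing only groups A and D, as we do next, contrasts the two extremes of the distribution and leaves out the middle half of the samples.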

This will bring you to the group comparison. When the screen is done loading, the top of the screen should show:

To get a better signal, we will compare the youngest group (14-32) with the oldest group (53-87) by deselecting groups B and C. This is done by clicking on each of these groups. We will also navigate to the genomic alterations tab, which replaces the separate mutations and copy number alterations tabs of previous versions.

By default, all alteration types are selected. To illustrate how this feature improves on the previous version, where mutations and CNA were in separate tabs, we will first select only the mutations, like this:

Because the analysis may take some time to complete and a user may want to (un)check multiple boxes, the selection has to be confirmed by clicking the ‘Select’ button. This avoids a few seconds of loading time after each box is checked. After pressing the button, this is what the top of the table looks like:

As you can see, PTEN is enriched in the oldest group but the result is not statistically significant.

Now try the same for CNA:

You will not see PTEN on the first page. The fastest way to find it is to use the search box:

Again, you will see that the enrichment is in the oldest group, and that it is not even close to being significant.

It should come as no surprise that we will now select both mutations and CNA, and we will find a significant enrichment in the oldest group. You can also see that there are more significant results than before.
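The reason the combined analysis can succeed where the separate ones fail is purely statistical: pooling alterations that point in the same direction increases the counts. The sketch below shows this with a two-sided Fisher's exact test written with the standard library only. The counts are invented and are not the PTEN numbers from the study, the choice of test is an assumption made for illustration, and we assume no sample carries both a mutation and a CNA.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x altered samples in the first row.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum over all tables that are at least as unlikely as the observed one.
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))


GROUP_SIZE = 100  # hypothetical number of samples in each age group

# Hypothetical altered-sample counts as (oldest group, youngest group).
mut_old, mut_young = 8, 2
cna_old, cna_young = 7, 2


def p_value(old, young):
    return fisher_exact_two_sided(old, GROUP_SIZE - old, young, GROUP_SIZE - young)


p_mut = p_value(mut_old, mut_young)
p_cna = p_value(cna_old, cna_young)
# Disjoint samples assumed, so the combined counts are simple sums.
p_combined = p_value(mut_old + cna_old, mut_young + cna_young)

print(f"mutations only: {p_mut:.3f}")
print(f"CNA only:       {p_cna:.3f}")
print(f"combined:       {p_combined:.3f}")
```

With these toy counts, neither alteration type reaches the usual 0.05 threshold on its own, while the pooled counts do.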

Example: filtering mutation types

Interestingly enough, this feature can also be used to filter down results. This is helpful when looking for genes that are differentially mutated for a particular mutation type but not for other mutation types. The mutations of these other types would be randomly spread over the group and could hide the signal.
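As a rough illustration of this filtering, the sketch below counts altered samples per group for all mutations, for truncating mutations only and for missense mutations only. The records are made up, and the set of classes treated as truncating is an assumption based on common MAF `Variant_Classification` values, not the portal's definition.

```python
# Assumed set of variant classes treated as truncating in this sketch.
TRUNCATING = {
    "Nonsense_Mutation",
    "Frame_Shift_Del",
    "Frame_Shift_Ins",
    "Splice_Site",
    "Nonstop_Mutation",
}

# Made-up records for one gene: (sample ID, group, variant class).
mutations = [
    ("S1", "A", "Frame_Shift_Del"),
    ("S2", "A", "Nonsense_Mutation"),
    ("S3", "A", "Missense_Mutation"),
    ("S4", "D", "Missense_Mutation"),
    ("S5", "D", "Missense_Mutation"),
    ("S6", "A", "Splice_Site"),
]


def altered_per_group(records, keep):
    """Count distinct altered samples per group, using only classes accepted by keep."""
    seen, counts = set(), {}
    for sample, group, variant_class in records:
        if keep(variant_class) and sample not in seen:
            seen.add(sample)
            counts[group] = counts.get(group, 0) + 1
    return counts


all_types = altered_per_group(mutations, lambda v: True)
truncating = altered_per_group(mutations, lambda v: v in TRUNCATING)
missense = altered_per_group(mutations, lambda v: v == "Missense_Mutation")

print(all_types)   # {'A': 4, 'D': 2}
print(truncating)  # {'A': 3}
print(missense)    # {'A': 1, 'D': 2}
```

In this toy data, the truncating mutations all fall in group A, but counting every mutation type together makes the difference between the groups look smaller.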

Let’s go back to our earlier example and select only mutations. This will yield only four significant results.

Now let’s select only truncating mutations. This yields only three significant results, but two of the genes did not show up as significant before: CIC and FUBP1.

We hypothesize that truncating mutations have an effect while missense mutations do not. This is confirmed if we select only missense mutations.

And this is what we see when looking at the mutations tab for CIC.

If you look at the summary to the right, you can see that, according to OncoKB, all of the truncating mutations in CIC are likely driver mutations, while this does not necessarily hold for the missense mutations.

In fact, most of these missense mutations are (likely) loss of function mutations according to OncoKB. You can confirm this by hovering over the OncoKB icon in the table. See for example CIC R215W, the most common mutation in CIC in this study:

Closing remarks

We are currently working on further improving the comparison view to allow users to filter on driver alterations and alterations of unknown significance. We will update this blog post when that feature is released on the public installation at cbioportal.org.

The Hyve provides services to develop, extend and improve features in cBioPortal, such as the one described in this article. Implemented features are released to the community via the cBioPortal repository on GitHub. We hope you like this new feature and that it proves useful in your work. Many thanks to AstraZeneca for sponsoring this work. For inquiries on cBioPortal feature development projects or other services around cBioPortal, feel free to contact us.