# This R script extracts information from the two expression datasets available for download from the Human Protein Atlas (https://www.proteinatlas.org/about/download):
# RNA (https://www.proteinatlas.org/download/rna_tissue_consensus.tsv.zip) and protein (https://www.proteinatlas.org/download/normal_tissue.tsv.zip).
# For each gene, the 3 organs (or tissue groups) with the highest expression are identified separately in the RNA dataset and in the protein dataset.
# When organs tie, which happens often because the protein dataset uses discrete expression categories, more than 3 organs are kept.
# The results can then be cross-referenced with a list of genes of interest, in our case the list of genes with CNVs at the population level.
# Before running the script, change the reference paths to your own.
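
# Illustrative sketch of the "top 3 with ties" rule described above. This is an assumption
# about one reasonable way to implement it, not necessarily the method used further down in
# this script. The helper name top_with_ties and the example values are hypothetical.
# Values are ranked in descending order with ties sharing the lowest rank, and every entry
# whose rank is at most n is kept, so ties at the cutoff can yield more than n entries.
top_with_ties <- function(values, n = 3) {
  # rank() sorts ascending, so negate to rank the highest expression first;
  # ties.method = "min" gives tied values the same (lowest) rank
  ranks <- rank(-values, ties.method = "min")
  unname(which(ranks <= n))
}

# Hypothetical example: discrete protein-style levels encoded as numbers for one gene
example_levels <- c(liver = 3, kidney = 3, lung = 2, brain = 2, skin = 1)
# lung and brain tie for third place, so four organs are kept instead of three
example_top <- names(example_levels)[top_with_ties(example_levels)]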