Chromosome 20 open reading frame 85, most commonly known as C20orf85, is a gene that encodes the C20orf85 protein (UniProt ID: Q9H1P6). The gene is not yet well understood by the scientific community.
It is located on chromosome 20, more specifically at 20q13.32, and runs in the 5' to 3' direction on the top strand. The pseudogene HSPD1P19 (Heat Shock Protein Family D Member 1 Pseudogene 19) neighbors C20orf85, preceding it when the strand is read from the 5' to the 3' end.[1]
RNA
The C20orf85 mRNA is 805 nucleotides long and encodes the C20orf85 protein.[2] As of June 2022 it is the only known variant of the gene, and it contains 4 exons. Expression has been detected in samples of the testis, endometrium, and liver in the HPA RNA-seq normal tissues dataset.[1]
Protein
The protein contains 137 amino acids and is most commonly called "uncharacterized protein C20orf85"[3] or pfam14945. It has an approximate molecular weight of 15.5 kDa and an isoelectric point of 8.72.[2] Compared with other human proteins, the C20orf85 protein is rich in the amino acids tryptophan and proline.[4]
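The quoted lengths and mass can be cross-checked with simple arithmetic. The sketch below is only a rough sanity check, not the method used by the cited sources: it assumes a single contiguous reading frame with one stop codon, and it uses the common rule of thumb of about 110 Da per amino acid residue. The function names are illustrative.

```python
# Rough sanity checks on the figures quoted in the text.
# Assumption: ~110 Da is a rule-of-thumb average residue mass, not a
# sequence-specific value, so the result is only an approximation.
AVG_RESIDUE_MASS_DA = 110.0


def cds_length_nt(protein_length_aa: int) -> int:
    """Nucleotides needed to encode the protein, counting one stop codon."""
    return 3 * (protein_length_aa + 1)


def approx_mass_kda(protein_length_aa: int) -> float:
    """Approximate protein mass in kDa from its length alone."""
    return protein_length_aa * AVG_RESIDUE_MASS_DA / 1000.0


protein_length = 137  # amino acids, from the text
mrna_length = 805     # nucleotides, from the text

cds = cds_length_nt(protein_length)        # coding sequence length
non_coding = mrna_length - cds             # remainder of the mRNA (untranslated)
mass = approx_mass_kda(protein_length)     # compare with the reported ~15.5 kDa

print(f"coding sequence: {cds} nt")
print(f"non-coding remainder: {non_coding} nt")
print(f"approximate mass: {mass:.2f} kDa")
```

Under these assumptions the estimate lands close to the reported 15.5 kDa. A small gap is expected, since a protein rich in tryptophan, the heaviest standard residue, would tend to sit above a generic per-residue average.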
Structure
According to I-TASSER and AlphaFold, the C20orf85 protein is predicted to contain many more helices than sheets.
Graph: evolution rate of the human C20orf85 protein compared with human cytochrome c (orange trend line), which is known to evolve very slowly, and human fibrinogen alpha (green trend line), which is known to evolve very quickly.
C20orf85 is predicted to evolve at a moderate pace, slower than the fast-evolving protein fibrinogen alpha but faster than the slow-evolving protein cytochrome c.
Paralog
The C20orf85 protein is paralogous with the protein C2orf50. The two human proteins are estimated to have diverged from each other around 750 million years ago.
Orthologs
The table is sorted first by date of divergence, the most confident numerical value, and then by sequence identity to the human C20orf85 protein, the second most confident value. Colors are assigned according to class.
C20orf85 has a large number of orthologs, including in mammals, reptiles, and birds. The table shows a wide range of orthologs, chosen for the type of animal and for their sequence identity.
Interacting Proteins
C20orf85 has many interacting proteins. The proteins below were included because of their association with disease and because their localization is similar to that of C20orf85.
Localized in the cytoplasm, cell membrane, and Golgi apparatus
Involved in autosomal recessive deafness
Clinical Significance
Research by Kyeong-Man Hong found that LLC1 (C20orf85) is inactivated in some patients with non-small cell lung cancer, but the reason for this is currently unknown.[11] The study "Immunohistochemical localization of LLC1 in human tissues and its limited expression in non-small cell lung cancer" found expression in the lung, but no further findings from that article have been evaluated.[12]
Mutations
Many mutations are listed in the NCBI SNP dataset for C20orf85.[13] Those included below were selected for the reasons given in the "Significance" column.
SNP
Position (aa)
Base Change
Amino Acid Change
Mutation Type
Significance
Clinical Significance
rs1207088890
7
A-->G
Ser-->Gly
Missense
Highly conserved with phosphorylation site and O-beta-GlcNAc site