Osteoarthritis year in review 2019: genetics, genomics and epigenetics

Although osteoarthritis (OA) aetiology is complex, genetic, genomic and epigenetic studies published within the last decade have advanced our understanding of the molecular processes underlying this common musculoskeletal disease. The purpose of this narrative review is to highlight the key research articles within the OA genetics, genomics and epigenetics fields that were published between April 2018 and April 2019. The review focuses on the identification of new OA genetic risk loci, genomics techniques that have been used for the first time in human cartilage and new publicly available databases, and datasets that will aid OA functional studies. Fifty-six new OA susceptibility loci were identified by two large scale genome wide association study meta-analyses, increasing the number of genome-wide significant risk loci to 90. OA risk variants are enriched near genes involved in skeletal development and morphology, and show genetic overlap with height, hip shape, bone area and developmental dysplasia of the hip. Several functional studies of OA loci were published, including a genome-wide analysis of genetic variation on cartilage gene expression. A specialised data portal for exploring cross-species skeletal transcriptomic datasets has been developed, and the first use of cartilage single cell RNAseq analysis reported. This year also saw the systematic identification of all microRNAs, long non-coding RNAs and circular RNAs expressed in human OA cartilage. Putative transcriptional regulatory regions have been mapped in human chondrocytes genome-wide, providing a dataset that will facilitate the prioritisation and characterisation of OA genetic and epigenetic loci. © 2020 The Authors. Published by Elsevier Ltd on behalf of Osteoarthritis Research Society International. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).


Introduction
Osteoarthritis (OA) is a chronic musculoskeletal disease characterised by the destruction of articular cartilage, synovial inflammation and bone remodelling. Although disease aetiology is complex, our understanding of the molecular processes underlying OA has been advanced through genetic, genomic and epigenetic studies 1e3 . The goal of this annual review is to summarise the progress made within this field between April 2018 and April 2019. We have chosen to focus on the identification of new OA genetic risk loci and on new genomics techniques that have been used in human cartilage for the first time. We also highlight new publically available datasets and databases that will aid future functional studies of OA genetic and epigenetic loci.

Identification of new OA genetic risk loci
Identification of genetic loci using hypothesis-free genome wide association studies (GWAS) has given novel insights into diseases 4e6 , identified potential drug targets 7 , and led to the use of polygenic risk scores in selecting individuals for clinical trials 8 . However, until recently, identification of OA loci was trailing behind other complex diseases and phenotypes such as rheumatoid arthritis (over 100 risk loci; 9 ) and height (over 3,000 loci; 10 ), with only 19 OA loci reported up to April 2017 at genome wide significance level of P 5 Â 10 À8 (Fig. 1(A), Table I, 11e21 ). The number of significant OA genetic risk loci has now increased to 90 (Table I, 22e29 ), the majority of which have small effects sizes (odds ratios [OR] 1.03 to 1.25, Fig. 1(B)). As the number of samples included in the analysis has a significant effect on the number of loci identified 30 , the increase in loci is largely explained by the publication of several large scale, well powered OA GWAS. Four of these studies have utilised genotyping and medical data for ~488,000 individuals available as part of the UK biobank (UKBB; https://www.ukbiobank.ac.uk/), together with existing OA GWAS cohorts 26e29 . The two largest OA analyses were published within the last year and together they identified 56 new loci ( Fig. 1(B) and (C); 28,29 ). Unlike the majority of previous OA studies, which have employed the traditional GWAS design of signal discovery followed by replication, both studies performed a meta-analysis of the UKBB data together with the Icelandic deCODE genetics 28 or UK arcOGEN datasets 29 . This meta-analysis approach has several advantages, including increasing the power to detect genetic variants with small to modest effects sizes whist reducing the probability of falsenegative results and identify effect size heterogeneity across cohorts 31 .
The deCODE-UKBB study included over 650,000 British and Icelandic individuals, making it the biggest OA GWAS published to date 28 . They analysed 11.6 million genotyped variants present in both cohorts, performing separate meta-analyses for hip and knee OA [ Fig. 1(C)]. The same OA definition was used for both cohorts, OA genetic risk loci. A) Number of OA genetic risk loci identified at the genome-wide significance level by date. The number of loci has increased from 19 to 90 since the release of the UK biobank (UKBB) genotyping data, which was used in four OA GWAS studies published in 2018 and 2019 26e29 . A full list of genome-wide significant OA loci is available in Table I. B) Odds ratio (OR) of OA risk alleles plotted against allele frequency for all OA SNPs listed in Table I. Arrows point towards the variants with the highest OR and highest OA risk allele frequency, with the nearest gene to these two variants indicated. Note, for some loci several SNPs have been reported. C) Overlap in the new OA risk loci identified in the deCODE-UKBB and arcOGEN-UKBB studies 29,30 . The name of the nearest gene to the associated DNA variant is given; this may not be target gene of the OA. The variant identifier for each OA signal is given in Table I. *The OA variant is also associated with height at P < 5 Â 10 À828 . The number of OA cases and controls for each analysis is shown in brackets. # Two independent signals were reported overlapping the MAPT gene in the arcoGEN-UKBB study. Two independent COL11A1 signals were identified in the deCODE-UKBB study, one of which was also identified in the arcOGEN-UKBB study and was associated with height in 28 .

Osteoarthritis andCartilage
and individuals with known causes of secondary OA (e.g., prior ACL injury and acetabular dysplasia) were excluded, yielding a total of 17,151 OA hip cases, 23,877 OA knee cases and up to 619,289 controls (individuals without any medical record of OA). They identified 23 independent associations at 22 loci in the additive metaanalyses (odds ratios [OR] 1.06 to 2.84) and two loci using a recessive mode (OR 1.95e5.89). This included 12 novel hip loci, four novel knee loci and replication of numerous existing loci including CHADL, GDF5 and FILIP1. The arcOGEN-UKBB study analysed up to 17.5 million variants in over 455,000 UK individuals, reporting 65 genome-wide significant variants mapping to 64 loci 29 . Separate analyses were performed for four OA phenotypes; OA hip, OA knee, hip and/or knee OA and OA at any site, although different OA definitions were used in the two cohorts. Fifty-two novel loci with ORs of 1.03e1.83 were reported, comprising of 15 new hip loci, seven knee loci, six hip and/or knee loci and 24 loci for OA at any site. Of the novel loci reported by these two meta-analyses, twelve were significant at the genome-wide level in both studies, including variants mapping to the LMX1B, COL11A1 and IL11 genes (Table I, Fig. 1(C)); a larger overlap is observed when considering all signals that reach genome-wide suggestive P values These studies confirmed that there are joint-site differences in genetic risk, with the majority of loci only associated with OA at a specific joint (Table I). For example, the two signals with the most significant P values in both studies, mapping to the GDF5 and PTHLH loci, were only significant for knee or hip OA respectively.
Pathway analysis revealed that OA loci are enriched near genes involved in skeletal development and/or associated with rare monogenic bone diseases 29 . This includes the TGFb pathway genes TGFB1, LTBP1, LTBP3 and SMAD3, and the recently identified ROCR long non-coding RNA (lncRNA) that acts upstream of SOX9 during chondrogenic differentiation 32 . New OA pathways may be identified in the near future with the publication of the first multi-ethnic OA GWAS meta-analysis study performed by the Genetics of Osteoarthritis consortium ( 33 ; https://www.genetics-osteoarthritis. com/home/index.html). This is a global collaboration of 18 GWAS cohorts from Europe, Japan, Hong Kong and the USA, and includes more than twice the number of OA cases used in previous GWAS studies. In addition to hip and knee OA, meta-analyses are also being performed for spine, thumb and finger OA 33 .
Genetic overlap of OA with hip morphology, height, bone area and DDH It has been hypothesised that OA genetic risk loci may act during skeletal development, causing subtle differences in joint development and shape that, through alterations of joint biomechanics, predispose the individual to OA later in life 3,34,35 . Several genetic studies published this year have provided further evidence of this hypothesis, with DNA variants associated with height 28 , hip shape 36 , and developmental dysplasia of the hip (DDH; 37 ) overlapping OA loci. The genetic link between OA risk and height was examined in the deCODE-UKBB GWAS, with 12/25 OA loci associated with increased (e.g., FAM101A, FILIP1) or decreased height (e.g., COL11A1, GDF5) at the genome-wide significance level (Table I,  Supplemental Table 1; 28 ). The first GWAS for hip shape used DXA scans of almost 16,000 individuals from five cohorts, and identified nine loci at P < 5 Â 10 À9 and 12 at P < 5 Â 10 À836 . Two signals overlapped the established OA hip loci near the PTHLH and ASTN2, and another signal was located downstream of ROCR, overlapping the ROCR/SOX9 OA knee loci reported in the arcoGEN-UKBB GWAS 29 . Co-localisation analysis suggested a shared causal variant for hip shape, OA and height at the ASTN2 locus, and for OA and hip shape at the PTHLH locus.
DDH is the most common developmental musculoskeletal disease and is characterised by abnormal development of the hip joint, a risk factor for degenerative hip diseases including OA. The largest DDH GWAS to date identified an association between the GDF5 OA knee SNP rs143384 and DDH in European individuals 37 , confirming previous reports of a suggestive association in Asian populations 38 . rs143384 explained approximately 1% of DHH heritability, with fine mapping indicating this SNP had >99% likihood of being the causal DDH variant at the locus. rs143384 is also associated with hip intertrochanteric/shaft and trochanter bone area as well as bone area of the lumbar spine 39 . Although GDF5 is not a risk locus for hip OA in Europeans 28,29 , a new study in mice demonstrated a direct link between this gene and hip morphology 40 . Compared to heterozygous mice, mice lacking the GDF5 gene had abnormal proximal femur and acetabular morphology, including smaller femoral heads and neck. These dysmorphologies are concordant with changes that cause hip instability, injury and adult onset OA in humans.

Functional studies of OA loci
Once a DNA region harbouring OA genetic risk has been identified by GWAS, functional studies are required to pinpoint the causal variant(s), target gene(s), cell or tissue type in which this risk allele is acting, and when in the lifecourse this occurs. Although such functional studies have lagged behind identification of risk loci for many diseases 41 , there have been several functional studies of individual OA loci published this year. Together, these studies have prioritised candidate genes 42 , highlighted potential mechanisms of action 42e44 , and examined the role of target genes in cartilage homeostasis and disease using mouse models 45,46 . However, with identification of 56 new loci in the last year, there is a need to prioritise specific DNA variants and genes from within risk loci for functional studies in a rapid and systematic way.
The majority of common disease causing SNPs, including those for OA, are thought to act by altering transcription factor binding, subtly affecting transcription of nearby genes such that one allele drives higher gene expression than the other, termed allelic imbalance (AI). Online tools such as HaploReg 47 and LDlink 48 are invaluable for prioritising potential target genes and highlighting putative causal variants within a locus, some of which contain 100se1000s of variants in high linkage disequilibrium (LD) with the OA SNP (see Table I). These tools integrate genetic data from the 1,000 Genomes Project 49 with chromatin status, transcription factor binding, and DNaseI hypersensitivity (DHS) mapping generated by the ENCODE 50 and Epigenome Roadmaps projects 51 , and gene expression data from the GTEx Project (https://gtexportal.org/ home/). Although chromatin state information is available for isolated osteoblasts and in vitro differentiated chondrocytes, there are limitations in the application of these tools for OA studies as the majority of data used for variant and gene prioritisation has been generated in non-synovial joint tissues.
However, a study published this year sought to detect all transcript SNPs showing AI in OA cartilage 52    Osteoarthritis andCartilage is actually associated with increased expression of the nearby CDK2AP1 gene rather than SBNO1 itself, highlighting the usefulness of this dataset for prioritising genes as well as variants within OA susceptibility regions.

Transcriptomics
A wealth of novel human skeletal transcriptomic analyses were published the last year, including RNAseq based datasets of OA cartilage 52e55 , timecourse analysis of in vitro chondrogenesis 56,57 and profiling of human foetal chondrocytes, myoblasts, osteoblasts, ligamentocytes and tenocytes 57 . Such RNAseq datasets generate a huge treasure trove of gene expression information, although the published studies typically focus only on a small subset of the data, with the remaining data either relegated to supplemental files or not provided. Furthermore, it is difficult to compare the results between studies due to differences in data analysis methods. These issues have been addressed by SkeletalVis, a new online web application designed as a specialised meta-analysis portal for skeletal disease transcriptomic datasets (http://phenome. manchester.ac.uk/ 58 ). The portal currently contains microarray and RNAseq data from 300 studies encompassing a total of 779 individual analyses in cartilage and other OA-relevant tissues. SkeletalVis permits exploration and comparison of existing human and animal model datasets, including the identification of gene signatures across studies and species. Furthermore, the user can upload their own unpublished data in order to compare it to existing datasets and perform downstream analyses, including transcription factor and gene ontology enrichment analysis. The portal also allows the expression of a specific gene to be compared across the different experiments, a very useful tool for preliminary analysis of OA-associated genes identified through GWAS, transcriptomic and epigenomic studies.
Although bulk RNAseq studies are now the norm, advances in sequencing technology have allowed the transcriptome to be analysed at the level of individual cells. Single cell RNA sequencing (scRNAseq) is becoming increasingly popular, leading to new insights into development and disease 59 . This year, the first human cartilage single-cell RNAseq analysed the transcriptome of 1,464 chondrocytes isolated from the tibial plateau of 10 knee patients 60 . Seven molecular subgroups of OA chondrocyte were identified, including three novel populations termed effector (EC), regulatory (RegC) and homeostatic (HomC) chondrocytes. These subgroups can be classified based on expression of 792 genes, several of which map to newly identified OA susceptibility regions (e.g., GLIS3, TGFB1, TNC and WWP2). Single cell analysis of OA synovium has also been performed, with Chou and colleagues reporting analysis of over 10,000 synovial cells and 26,000 OA chondrocytes from three OA knee patients at OARSI 2019 61 . The number of single-cell RNAseq analyses of OA-relevant tissues is expected to rise over the next few years and these studies have the potential to identify new pathological cell types and pathways.

MicroRNAs
Numerous microRNAs (miRNAs) have now been associated with cartilage development or homeostasis and during the development of OA 62 . To identify candidate miRNAs for future miRNA-based OA therapies, a study this year aimed to identify all cartilage miRNAs involved in OA pathophysiology 53 . They leveraged mRNA and small RNA-seq data from 19 paired damaged and intact OA knee and hip cartilage samples in order to create a chondrocyte-specific interaction network of miRNAs with their target mRNAs. Utilising predicted and validated miRNA target databases, an 'OA-specific miRNA interactome' of 62 differentially expressed miRNAs and 238 differentially expressed target mRNAs was generated. One notable miRNA was miR-99a-3p, which had not been associated with OA prior to the study. miR-99a-3p was downregulated in lesioned OA cartilage and targeted 36 genes, including FZD1, ITGB5 and GDF6. A second large miRNAemiRNA cluster centred on the upregulated miRNA miR-143e3p, and consisted of 16 target genes including DCAKD, AMIGO1 and SMAD3. Importantly, functional validation was performed for both miR-99a-3p and miR-143e3p, confirming several of the identified target interactions.
A number of other studies published this year have further expounded targets of miRNAs in cartilage and bone with implications for skeletal development and homeostasis. One study focused on miR-204 63 , which is upregulated in human damaged OA cartilage and mouse cartilage with age or after OA induction. miR-204 is also induced by number of senescence inducers (including H202 and infra-red radiation). Interestingly miR-204 repressed chondrocyte sulphated glycosaminoglycan production by directly targeting a number of transcripts involved in cartilage proteoglycan biosynthesis, including genes crucial for chondroitin sulphate (CS) and hyaluronic acid formation. Furthermore, in vivo intra-articular injections of a miR-204 mimic exacerbated cartilage damage in the mouse DMM model, while miR-204 inhibitor injection caused a reciprocal protection against experimental OA. Inhibition of miR-204 in human chondrocytes could similarly upregulate CS levels and suppress catabolic MMP expression.
A second study focused on miR-181a-5p, which is increased in degenerated human facet joints and OA knee cartilage, and during mouse OA development 64 . Injection of antisense oligonucleotides, which attenuate miR-181a-5p activity, reduced cartilage damage in knee joints with concomitant reduction in catabolic gene expression and markers of cartilage damage. This finding was confirmed in human cartilage explants and further work is required to characterise the targets and pathways mediating the effect of miR-181e5p in chondrocytes.
The crucial role of miR-140 in skeletal development is well established 65,66 , and a novel pathological mutation within this miRNA was recently described 67 . An A > G nucleotide substitution within the miR-140e5p seed region caused a novel human skeletal dysplasia with features including short stature, brachydactyly and delayed epiphyseal ossification leading to degenerative joint disease. This mutation causes both de-repression of conserved miR-140e5p targets and suppression of an additional repertoire of genes that are targeted by the new seed sequence. These include genes required for skeletal development such as Loxl3 and Hif1a. Importantly, a knock-in mouse model containing the human miR-140e5p seed mutation had skull, cartilage and long bone development alterations, phenocopying the skeletal dysplasia features of the patients.
Other key findings in the miRNA field include the observation that miR-138-mediated inhibition of osteogenesis could be rescued by overexpression of its target RhoC 68 , miR-93e5p may target TCF4 to prevent cartilage matrix degradation 69 , and finally, that miR-324e5p is upregulated in OA and targets the hedgehog signalling pathway in chondrocytes 70 .

Other non-coding RNAs
Unlike miRNAs, research into the role of other types of noncoding RNAs in OA and cartilage remains in its infancy. These include long noncoding RNAs (lncRNAs), a diverse group of noncoding transcripts over 200 nt in length. Although the role of individual lncRNAs such as ROCR 32 have been investigated (reviewed in 71 ), the first study to characterise lncRNAs genome-wide in OA cartilage was published this year 54 . Ajekigbe and colleagues sought to define the lncRNA transcriptome in OA cartilage by analysing both hip and knee cartilage RNA-seq data and identified a total of 1834 lncRNAs. Twenty lncRNAs were significantly differentially expressed between both OA and non-OA hip cartilage and preserved and damaged regions of OA knee cartilage, including the imprinted gene MEG3. Future work will focus on the identification of lncRNA functions in cartilage, thereby enabling investigation into the consequence of lncRNA expression changes in OA.
Circular RNAs (circRNAs) are single-stranded RNAs usually formed by alternative splicing of pre-mRNAs in which the 5 0 and 3 0 ends have been spliced together to form a loop 72 . Systematic analysis of circRNAs expressed in OA cartilage identified CircSER-PINE2, which is formed by circularisation of exons 2e4 of the serine protease inhibitor gene SERPINE2 73 . The expression of CircSER-PINE2 in human cartilage was comprehensively validated and found to be downregulated in OA in contrast to the linear SERPINE2 transcript. This circRNA is downregulated during OA and intraarticular overexpression of CircSERPINE2 alleviated OA severity in a rabbit model. Specific inhibition of CircSERPINE2 by siRNA caused the suppression of key cartilage genes such as SOX9 and the upregulation of catabolic genes such as MMP13. CircSERPINE2 pulldown experiments identified binding of miR-1271, which is upregulated in OA, and the authors posited that CircSERPINE2 acts as a sponge for miR-127e5p. Indeed overexpression of miR-1271 in chondrocytes phenocopied the effect of CircSERPINE2 siRNA, while inhibition of miR-1271 could rescue the phenotype. However, miR-1271 is relatively lowly expressed in chondrocytes and other studies have failed to detect expression changes in OA suggesting other mechanisms of CircSERPINE2 action may still be discovered 74,75 .

Genome-wide mapping of open chromatin and histone modifications
Within the OA epigenetics field, the majority of research has centred on DNA methylation and non-coding RNAs. The few studies of histone modifications have primarily focussed on the role of individual chromatin remodelling proteins in cartilage (e.g., DOT1L; 45,76 ) rather than genome-wide mapping of specific histone modifications and open chromatin. Techniques used for the latter analyses, namely chromatin immunoprecipitation sequencing (ChIPseq) and DHS mapping, have previously required unfeasibly large number of freshly isolated cells (typically 1 Â 10 6 to 2 Â 10 7 cells), preventing such studies in cartilage. However, this has changed within the last year with publication of the first reports of histone ChIPseq and mapping of open chromatin regions in human chondrocytes 57,77 .
In the first study, Ferguson and colleagues performed ChIPseq of fetal and adult chondrocytes for the active histone modifications H3K4me1, H3K4me3 and H3K27ac, and the repressive H3K27me3 mark using only 10,000 chondrocytes per ChIP 57 . ChIPseq was also generated from pluripotent stem cells at day 14 and day 60 of in vitro chondrogenic differentiation. Matched RNAseq data available for both in vitro differentiation timepoints as well as the fetal and adult chondrocyte subtypes. Based on the combination of different histone modifications, 12 chromatin states were identified, including putative active promoter enhancer regions. Given the utility of the chromatin state information for future epigenetic studies, the authors have made the ChIPseq and RNAseq data available to download (GSE111850 and GSE106292 respectively).
In the second study 77 , the open chromatin regions of preserved and damaged regions of knee cartilage from eight OA patients were mapped using the assay for transposase-accessible chromatin using sequencing (ATACseq; 78 ). This is a fast and sensitive alternative method to DHS mapping that requires 5,000 to 50,000 cells, making analysis of cartilage and other primary joint tissues feasible. Of the 109,215 accessible chromatin regions identified in this study (available at https://www.nature.com/articles/s41598-018-33779-z#MOESM1), 71.1% mapped to putative enhancer regions, and 4% had altered accessibility between damaged and undamaged OA chondrocytes. DNA variants within several established OA loci map to these ATACseq peaks, as do the variants within 28 of the 56 new OA loci, including LTBP1, LTBP3 and SBNO1 (Table I).Collectively, these chromatin datasets will be an invaluable resource for identifying non-coding regulatory regions in cartilage, and will aid prioritisation of OA risk variants and epigenetic changes for functional analyses.

Summary
Within the last year, the number of OA genetic risk loci has increased from 34 to 90 28,29 , with several OA loci also associated with height, hip shape, DDH and bone area 28,36,37,39 . However, functional studies are required to elucidate the molecular mechanism whereby these variants increase OA risk. Integrating genetic variants with genome-wide datasets of cartilage AI 52 , chromatin states 57 and open chromatin regions 77 will inform functional analyses of these loci. RNA sequencing studies have characterised new chondrocyte subtypes 60 and systematically identified all miRNAs 53 , lncRNAs 54 and circRNAs 73 present in OA cartilage. The number of transcriptomic and epigenomic studies will continue to increase as sequencing costs fall and technical difficulties are overcome. As well as analysing cartilage from different joints at numerous time points in development, adulthood and during OA disease progression, future studies will hopefully examine additional diseaserelevant tissues such as bone, synovium and fat pad.

Author contributions
Dr Louise Reynard searched the literature, summarised the results and wrote the manuscript, with Dr Matthew Barter contributing to the section on non-coding RNAs.

Conflict of interest
We have no conflicts of interest.

Role of funding source
No specific funding was obtained for this paper. Dr Barter is funded by the Dunhill Medical Trust (grant reference R476/0516).