Introduction

Prostate cancer (PCa) has been shown to follow a different course and to differ in nature in people of African ancestry. Not only is the transformation from the benign form to metastatic disease more precipitous, but tumor size is also reportedly larger, and African men present with more advanced disease1,2. PCa risk and disease course have also been associated with both rare and common African-specific inherited variants3,4, as well as cancer driver variants5. Additionally, several socioeconomic and environmental factors have been shown to contribute to higher PCa incidence and aggressiveness among African American3 and African continental populations6. While African American men are less likely to seek treatment for prostate-related disease, which has been attributed to a lack of insurance coverage, financial barriers and poor health-seeking attitudes2,7, dietary factors have also been considered2. It is therefore believed that the advanced form of the disease in people of African ancestry stems from a combination of genetic and non-genetic factors8.

Notably, PCa research pertaining to men of African ancestry has largely been driven from the United States, with a marked lack of data from Sub-Saharan Africa. We have previously reported that Black men from South Africa are at 2.1-fold increased risk, after adjusting for age, of presenting with advanced PCa compared with African Americans from the National Cancer Institute (NCI) Surveillance, Epidemiology, and End Results (SEER) Program9. Within South Africa, this risk increased 1.6-fold for men living in subsistence farming over urban localities. The mortality rate of PCa in southern Africa is 2.7 times higher than the global estimate, with incidence rates on the increase due to improved screening and diagnosis10. However, the factors mediating this pattern are still largely unknown, with a scarcity of studies on locally relevant lifestyle, demographic and environmental risk factors.

While it is well established that PCa risk and aggressive disease presentation are associated with African ancestry, it is unknown whether this association spans the spectrum of the genetically and culturally diverse African diaspora. The Southern African Prostate Cancer Study (SAPCS), recruiting within the borders of South Africa, provides a unique and currently unmet opportunity to address African-specific diversity, including four ancestrally (genetically and ethno-linguistically) distinct southern African populations, against non-African or ancestrally admixed South Africans. Interrogating a cohort of 1,387 southern African men diagnosed clinicopathologically either with or without PCa, we aimed to investigate the link between the rich southern African ancestral diversity and the sociodemographic and environmental factors associated with PCa occurrence and advanced disease presentation.

Materials and methods

Study design

Initiated in 2008 in South Africa, the Southern African Prostate Cancer Study (SAPCS) aimed to identify the genetic and non-genetic factors contributing to aggressive disease presentation across the region and the associated ancestral disparity6. Irrespective of genetic ancestry or cultural heritage, study participants were recruited between 2013 and 2019 at one of multiple participating urology clinics located within the provinces of Limpopo (including Polokwane Hospital and Tshilizini Hospital) and Gauteng (including Steve Biko Hospital, Kalafong Hospital and Dr George Mukhari Academic Hospital). The most common presenting complaints included urinary tract infections or erectile dysfunction. Patients with a recorded Gleason score (International Society of Urological Pathology (ISUP) group grading) or other records indicating PCa, such as a biopsy, radical prostatectomy and/or presence of metastasis, were considered cases, and those with a definite diagnosis other than PCa or a negative biopsy were considered controls. Patients whose PCa status could not be definitively determined were categorized as “unknown”.
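The case definition and grade grouping described above can be sketched as follows. This is a minimal illustration rather than the SAPCS data pipeline: the function names and the two boolean inputs are hypothetical, and the mapping from Gleason patterns to ISUP grade groups follows the standard ISUP definitions, which the text references but does not spell out.

```python
def isup_grade_group(primary: int, secondary: int) -> int:
    """Map Gleason primary and secondary patterns to an ISUP grade group (1-5)."""
    if not (1 <= primary <= 5 and 1 <= secondary <= 5):
        raise ValueError("Gleason patterns must be between 1 and 5")
    score = primary + secondary
    if score <= 6:
        return 1
    if score == 7:
        # Gleason 3+4 is grade group 2; Gleason 4+3 is grade group 3.
        return 2 if primary <= 3 else 3
    if score == 8:
        return 4
    return 5  # Gleason score 9 or 10


def pca_status(pca_evidence: bool, non_pca_evidence: bool) -> str:
    """Assign study status from two hypothetical record flags.

    pca_evidence: a recorded Gleason score or another record indicating PCa.
    non_pca_evidence: a definite non-PCa diagnosis or a negative biopsy.
    """
    if pca_evidence:
        return "case"
    if non_pca_evidence:
        return "control"
    return "unknown"


# A Gleason 4+4 tumour falls in the ISUP >= 4 category used in the grade analysis.
print(isup_grade_group(4, 4), pca_status(True, False))
```

Under this mapping, a Gleason score of 8 or higher corresponds to ISUP ≥ 4, and a Gleason 4+3 tumour or higher corresponds to ISUP ≥ 3.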

Study population

Excluding non-South Africans, all men included in the study self-identified as representing one or more ethno-linguistic classifiers. While data was collected for two generations, parental ethno-linguistic identifiers were used for participant classifications, with southern African identifiers including Ndebele, Pedi, Shangaan, Sotho (Northern), Swazi, Tswana, Tsonga, Venda, Xhosa, and Zulu (minus their ethnic or linguistic prefixes). Further classification using the Guthrie classification of the Southern Bantu or zone S languages includes S20-Venda (Tshivenda speakers), S30-Sotho-Tswana (Sesotho, Sepedi, Setswana), S40-Nguni (isiNdebele, isiXhosa, isiZulu, siSwazi), and S50-Tsonga (Xitsonga, including in our study ethnically reported Shangaan). We have previously demonstrated that the Southern Bantu peoples have a distinct Bantu ancestral genetic heritage, which includes varied ancient KhoeSan contributions11,12. Non-Bantu speakers who self-identified as either Afrikaner, German, English South African, or White South African were classified in this study as European, while those who self-identified as Baster, Cape Coloured, Cape Malay or Indian were classified in our study as Admixed/Asian. We have previously published on the genetic ancestral substructure of the Baster of Namibia and the Cape Coloured of South Africa, which, spanning over 12 generations (30 years a generation), has a historical link to European and Indian/south Asian ancestry through the Dutch East Indian settlement of the Cape, together with ancient KhoeSan and, to a lesser degree, Bantu genetic fractions12.

Ancestral fractions

To further clarify the genetic ancestral fractions for the 780 Black South African participants, we performed ADMIXTURE v1.3.0 plot analysis13 using 33,410 single nucleotide polymorphisms (SNPs) pruned for linkage disequilibrium (LD), as previously described14. The ADMIXTURE analysis was conducted for K = 4, replicated in 3 of 10 runs, with a cross-validation error of 0.369. Ancestry proportions were plotted using the tool pong v1.5, with a greedy approach set to 0.9515.
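To make the notion of a predominant ancestral fraction concrete, the sketch below reads ancestry fractions in the layout of an ADMIXTURE `.Q` file (one individual per row, K whitespace-separated fractions that sum to 1) and reports the largest component for each individual. The three rows are hypothetical values, not study data, and the helper names are illustrative. Component order in ADMIXTURE output is arbitrary, so linking a column to a named ethno-linguistic group requires the self-reported labels.

```python
from typing import List, Tuple


def parse_q(text: str) -> List[List[float]]:
    """Parse .Q-style content: one individual per line, K ancestry fractions."""
    return [[float(x) for x in line.split()] for line in text.strip().splitlines()]


def predominant_component(fractions: List[float]) -> Tuple[int, float]:
    """Return (zero-based component index, fraction) for the largest fraction."""
    idx = max(range(len(fractions)), key=fractions.__getitem__)
    return idx, fractions[idx]


# HYPOTHETICAL fractions for three individuals at K = 4 (not study data).
example_q = """
0.70 0.10 0.15 0.05
0.05 0.80 0.05 0.10
0.20 0.20 0.25 0.35
"""

q_rows = parse_q(example_q)
calls = [predominant_component(row) for row in q_rows]
for i, (component, fraction) in enumerate(calls):
    print(f"individual {i}: component {component} (fraction {fraction:.2f})")
```

The third individual shows why a predominant fraction is a summary rather than a hard assignment: its largest component accounts for only about a third of its ancestry.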

Sociodemographic data

Following the urology visit and evaluation, sociodemographic data such as ethno-linguistic heritage were collected using a questionnaire. While patients were recruited within the Provinces of Limpopo and Gauteng, place of birth and previous (longer than 5 years) and current place of residence were recorded for each patient, showing that men had resided across all nine provinces of South Africa. Using the longest time of provincial residence, men were further classified according to poverty rate16: Eastern Cape and Limpopo as high poverty rate; KwaZulu Natal, Mpumalanga and North West as medium poverty rate; and Free State, Gauteng, Northern Cape and Western Cape as low poverty rate. Provincial residence was also used to determine likely dependence on subsistence farming17, including Eastern Cape, Limpopo, North West and KwaZulu Natal. Occupations that required lifting of no more than 10 pounds and involved sitting most of the time were listed as sedentary18. Jobs categorized as outdoor included driver, agriculture/gardening, defence, and construction/mining. The type of balding, family history of PCa, history of diabetes and sexually transmitted disease (STD), aspirin use, gynaecomastia and erectile dysfunction were documented based on the patient’s medical records and double-checked with the patient through interviews by trained SAPCS healthcare workers. Visits to traditional healers, having a hairy chest and red meat consumption were asked of the patients directly.
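The provincial classification above amounts to a lookup on the province of longest residence. A minimal sketch follows, in which the province groupings are taken from the text while the function names and the shape of the residence-history input (years lived per province) are assumptions made for illustration.

```python
from typing import Dict

# Province groupings as stated in the text.
POVERTY_RATE: Dict[str, str] = {
    "Eastern Cape": "high", "Limpopo": "high",
    "KwaZulu Natal": "medium", "Mpumalanga": "medium", "North West": "medium",
    "Free State": "low", "Gauteng": "low",
    "Northern Cape": "low", "Western Cape": "low",
}
SUBSISTENCE_FARMING = {"Eastern Cape", "Limpopo", "North West", "KwaZulu Natal"}


def longest_residence(years_by_province: Dict[str, float]) -> str:
    """Return the province with the most years of residence (first one wins ties)."""
    return max(years_by_province, key=years_by_province.get)


def classify_residence(years_by_province: Dict[str, float]) -> Dict[str, object]:
    """Derive poverty-rate and subsistence-farming indicators for one participant."""
    province = longest_residence(years_by_province)
    return {
        "province": province,
        "poverty_rate": POVERTY_RATE[province],
        "subsistence_farming": province in SUBSISTENCE_FARMING,
    }


# HYPOTHETICAL participant: 40 years in Limpopo, 10 years in Gauteng.
example = classify_residence({"Gauteng": 10, "Limpopo": 40})
print(example)
```

Note that both indicators are assigned at the provincial level, so they describe the likely environment of residence rather than an individual's own income or farming activity.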

Statistical analysis

After defining the prevalence of the clinical and sociodemographic variables, we tested the association of each variable with PCa using logistic regression models and then adjusted for all the study variables (age, ethnicity, PCa family history, sexually transmitted disease history, red meat consumption, aspirin use, gynaecomastia, subsistence farming, poverty rate, traditional healers and occupation). This approach was taken to ensure that potential confounding factors were adequately addressed. To address the missing data on PCa status, the unknown group was included in the model once as cases and once as controls, allowing for a descriptive sensitivity analysis. This analysis explored how different assumptions regarding the unknown data could impact our findings.
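The crude association and the sensitivity analysis described above can be illustrated for a single binary variable. For one binary predictor, the odds ratio from an unadjusted logistic regression equals the cross-product ratio of the 2×2 table, with a Wald confidence interval computed on the log scale. The sketch below covers only that crude case; the adjusted models require a multivariable fit in a statistics package. The group totals (741 cases, 505 controls, 141 unknown) come from the study, whereas the split into exposed and unexposed is hypothetical.

```python
import math
from typing import Dict, Tuple

# (exposed cases, unexposed cases, exposed controls, unexposed controls)
Counts = Tuple[int, int, int, int]


def odds_ratio_ci(counts: Counts, z: float = 1.96) -> Tuple[float, float, float]:
    """Crude odds ratio with a Wald 95% confidence interval from a 2x2 table."""
    a, b, c, d = counts
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Group totals are from the study; the exposed/unexposed split is HYPOTHETICAL.
cases = (300, 441)     # exposed, unexposed (total 741)
controls = (170, 335)  # exposed, unexposed (total 505)
unknown = (60, 81)     # exposed, unexposed (total 141)

scenarios: Dict[str, Counts] = {
    "complete case": (cases[0], cases[1], controls[0], controls[1]),
    "unknown as cases": (cases[0] + unknown[0], cases[1] + unknown[1],
                         controls[0], controls[1]),
    "unknown as controls": (cases[0], cases[1],
                            controls[0] + unknown[0], controls[1] + unknown[1]),
}

results = {name: odds_ratio_ci(counts) for name, counts in scenarios.items()}
for name, (odds_ratio, lower, upper) in results.items():
    print(f"{name}: OR = {odds_ratio:.2f}, 95% CI {lower:.2f}-{upper:.2f}")
```

Comparing the three scenarios shows how far an estimate can move under the two extreme assumptions about the participants with unknown status.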

We used two categorizations for ethnicity in the analyses. Patients were first categorized by ethno-linguistic sub-category and then as either Black South African or belonging to other groups, and the analysis was repeated. We further used logistic regression models to perform case-only analyses for Gleason score and age at diagnosis. Age at diagnosis was also examined using an ordinal logistic regression. Lastly, we performed interaction term analysis, including main effects, between Black ethnicity and each of the other variables to observe differences in association with advanced PCa between Black South Africans and other ethnicities.
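The interaction term analysis can be illustrated in its simplest form. In an unadjusted logistic model with a binary exposure, a binary group indicator and their product, the exponentiated interaction coefficient equals the ratio of the two stratum-specific odds ratios. The sketch below computes that ratio with a Wald interval; all counts are hypothetical and the function names are illustrative. The study models may differ, for example in how other covariates enter.

```python
import math
from typing import Tuple

# (exposed with outcome, unexposed with outcome,
#  exposed without outcome, unexposed without outcome)
Counts = Tuple[int, int, int, int]


def log_or_and_variance(counts: Counts) -> Tuple[float, float]:
    """Log odds ratio and its Wald variance for one 2x2 stratum."""
    a, b, c, d = counts
    return math.log((a * d) / (b * c)), 1 / a + 1 / b + 1 / c + 1 / d


def interaction_ratio(group_1: Counts, group_0: Counts,
                      z: float = 1.96) -> Tuple[float, float, float]:
    """Ratio of stratum-specific odds ratios (group 1 over group 0) with a 95% CI."""
    log_or_1, var_1 = log_or_and_variance(group_1)
    log_or_0, var_0 = log_or_and_variance(group_0)
    diff = log_or_1 - log_or_0
    se = math.sqrt(var_1 + var_0)
    return math.exp(diff), math.exp(diff - z * se), math.exp(diff + z * se)


# HYPOTHETICAL counts: exposure = age > 75 years, outcome = ISUP >= 4 among cases.
black_south_african = (20, 180, 60, 300)
other_ethnicities = (15, 35, 20, 80)

ratio, lower, upper = interaction_ratio(black_south_african, other_ethnicities)
print(f"ratio of odds ratios = {ratio:.2f}, 95% CI {lower:.2f}-{upper:.2f}")
```

A ratio below 1 means the exposure is less strongly associated with the outcome in the first group than in the second, and a confidence interval that excludes 1 indicates a statistically significant interaction on the multiplicative scale.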

Ethics statement

This study was approved by the University of Pretoria Faculty of Health Sciences Human Research Ethics Committee (HREC #43/2010, with US Federal wide assurance FWA00002567 and IRB00002235 IORG0001762) in South Africa, as initially approved by the Department of Health and Social Development, Limpopo Provincial Government Ethics Committee (#001/2008) and University of Limpopo Medunsa Research and Ethics Committee (#MREC/H/28/2009). Further data interrogation was performed under approval granted by the St. Vincent’s Sydney HREC (#SVH/15/227) in Australia. Informed consent was signed by all individuals. Additional approval and review for the study was provided by the U.S. Army Medical Research and Materiel Command (USAMRDC), Office of Human and Animal Research Oversight (OHARO), Office of Human Research Oversight (OHRO), E02371. All experiments were performed in accordance with relevant guidelines and regulations.

Results

Of the 1387 study participants visiting a contributing SAPCS urology clinic, 741 (53.4%) and 505 (36.4%) were identified as PCa cases and controls, respectively, while PCa status was undetermined for 141 (10.2%) participants (Table 1). Among cases and controls, 78.1% and 75.8% represented Black South African, 8.2% and 8.3% Admixed/Asian, and 9.2% and 10.5% European ancestries, respectively, which is highly reflective of the population distributions across the country. Further determination of population substructure within the 780 SAPCS Black South Africans using ADMIXTURE analysis confirmed the existence of genetic diversity between ethno-linguistic groups (Fig. 1), defining predominant Venda (yellow), Tsonga (aqua), Nguni (red) and Sotho-Tswana (blue) genetic contributions and therefore population substructure within the Southern Bantu. Furthermore, the age of participants across the cohort ranged from 40 to 107 years (mean = 68.8). It was therefore not surprising that our study was biased towards pensioners, representing 74.0% and 70.7% of cases and controls, respectively. A total of 250 (33.7%) of the cases presented with a Gleason score of 8 or higher (ISUP ≥ 4).

Table 1 Sociodemographic and clinical data of the southern African study participants.
Figure 1

ADMIXTURE plot analysis for 780 Black South African men from the Southern African Prostate Cancer Study (SAPCS) for K = 4 (replicated in 3 of 10 runs, cross-validation error of 0.369), demonstrating a unique predominant genetic fraction distinguishing the African ancestral ethno-linguistic groups defined as Venda (yellow), Tsonga (light blue), Nguni (dark blue) and Sotho-Tswana (red).

Case control and sensitivity analysis

Given that we could not ascertain the PCa status of 141 participants, we initially performed a complete case analysis (CCA). The CCA revealed that, in addition to being a Nguni speaker (OR = 1.63, 95% CI 1.02–2.63), age older than 75 years, erectile dysfunction, gynaecomastia, and vertex and complete balding patterns were associated with PCa (Fig. 2). The model was then adjusted for other variables, which showed that Black South Africans were more likely than other ethnicities to be diagnosed with PCa (OR = 1.44, 95% CI 1.05–2.00). More specifically, Nguni, Tsonga, Venda and Admixed groups were more likely than those of European ethnicity to be diagnosed with PCa. Other factors, namely age older than 75 years, history of STD, gynaecomastia and complete balding pattern, were also associated with being diagnosed with PCa in this model. In the sensitivity analysis, assuming unknown individuals to be PCa cases did not change any significant results except for aspirin use (OR = 0.79, 95% CI 0.64–0.99). However, most of the variables, including the ethno-linguistic variables, lost statistical significance when unknown individuals were assumed to be controls (Table 1S).
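As a reading aid for the odds ratios reported here, the sketch below recovers an approximate standard error, Wald z statistic and two-sided p-value from a published odds ratio and its 95% confidence interval. It assumes the interval is a Wald interval that is symmetric on the log scale, and because the reported values are rounded the result is only approximate. The function name is illustrative, and the input is the Nguni estimate quoted above.

```python
import math
from typing import Tuple


def wald_from_ci(odds_ratio: float, lower: float, upper: float,
                 z_crit: float = 1.96) -> Tuple[float, float, float]:
    """Approximate SE of log(OR), Wald z and two-sided p from an OR and its 95% CI."""
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(odds_ratio) / se
    # Two-sided p-value under the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p


# Reported crude estimate for Nguni speakers: OR = 1.63, 95% CI 1.02-2.63.
se, z, p = wald_from_ci(1.63, 1.02, 2.63)
print(f"SE(log OR) = {se:.3f}, z = {z:.2f}, p = {p:.3f}")

# Consistency check: the geometric mean of the CI limits should be close to the OR.
print(f"geometric midpoint of CI = {math.sqrt(1.02 * 2.63):.2f}")
```

An interval whose lower limit sits just above 1, as here, corresponds to a p-value just under 0.05, so such estimates are best read as borderline rather than strong evidence.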

Figure 2

Crude (orange) and adjusted (blue) associations of study variables with the risk of prostate cancer using a logistic regression model for 741 southern African cases and 505 controls. Adjusted associations include age, ethnicity, PCa family history, sexually transmitted disease history, red meat consumption, aspirin use, gynaecomastia, subsistence farming, poverty rate, traditional healers and occupation. The X axis is based on odds ratios.

Grade analysis

Black South Africans were more likely than all other groups to be diagnosed with advanced disease (ISUP ≥ 4: OR = 2.25, 95% CI 1.49–3.40 and ISUP ≥ 3: OR = 2.02, 95% CI 1.41–2.90). In addition, even though all Black ethno-linguistic groups were more likely than Europeans to have advanced grade, this relation was significant only for the Tsonga people (ISUP ≥ 4: OR = 3.43, 95% CI 1.62–7.27). Residing in provinces with high poverty rates was also associated with advanced PCa grade presentation (ISUP ≥ 4: OR = 1.51, 95% CI 1.07–2.13). After adjusting for other variables, the association of Black ethnicity and of the Tsonga people with ISUP ≥ 4 was attenuated but remained significant. Additionally, we found the Admixed/Asian group less likely to be diagnosed with ISUP ≥ 3 compared with Europeans. People who consumed red meat were also less likely to be diagnosed with advanced PCa (Fig. 3).

Figure 3

Association of study variables with ISUP ≥ 4 (A) and ISUP ≥ 3 (B) in unadjusted (orange) and adjusted (blue) logistic regression models for 716 southern African cases. Adjusted for age, ethnicity, PCa family history, sexually transmitted disease history, red meat consumption, aspirin use, gynaecomastia, subsistence farming, poverty rate, traditional healers and occupation. The X axis is based on odds ratios. OR odds ratio, CI confidence interval.

Age at diagnosis

A logistic regression model showed that Black South African people were more likely than non-Africans to be diagnosed with PCa at an age greater than 59 years (OR = 1.64, 95% CI 1.04–2.60). In the ordinal logistic regression not only Black ethnicity, but all Black ethno-linguistic groups were associated with older age at diagnosis compared with Europeans. Those with a family history of PCa tended to be diagnosed with PCa at a younger age (Table 2S).

Ethnicity

A positive family history of PCa was more likely to be found among non-Blacks than Black South Africans, both in cases and in the overall study population. Black people were more likely to live in provinces with subsistence farming or higher poverty rates, seek medical advice from traditional healers over first-call western practitioners, consume red meat, have gynaecomastia or a hairy chest, and report a previous outdoor job (Table 3S). Interaction term analysis showed that the pattern of association of age with advanced PCa (ISUP ≥ 4) was significantly different between Black South Africans and other ethnicities. Specifically, men older than 75 years of age were less likely to be diagnosed with advanced disease if they were from the Black South African group (OR = 0.21, 95% CI 0.04–0.98, Table 4S). A significant difference in the pattern of association was also observed for red meat consumption (ISUP ≥ 4 and ISUP ≥ 3) and history of STD (ISUP ≥ 3) with advanced PCa. This shows that Black southern Africans had a significantly increased risk of advanced disease compared to other ethnicities if they had a history of STD.

Discussion

As no definitive treatment exists for metastatic PCa9, it is critical that PCa is detected early, before metastasis. As such, identifying risk factors for PCa and the advanced form of the disease in the region is an important step towards decreasing its burden on the health care system. The incidence of PCa in South Africa has tripled in the last 15 years, which has largely been attributed to improvements in diagnosis19. However, little is known about the tumor characteristics and the possible factors behind this high incidence. Here we assessed risk factors for PCa and its aggressiveness among the ancestries of southern African men.

Concurring with previous studies showing that Black South Africans are more likely to be diagnosed with PCa and with advanced disease20, including compared with African Americans6, we advocate for earlier age PSA screening (around 45 years) in the southern African setting, as suggested for African American men21. The association of advanced disease with African over non-African ancestry remained statistically significant despite adjusting for known PCa risk variables, including age. Other than genetic factors, an explanation for this association is that reportedly only 9.9% of Black South Africans have private health insurance and are therefore reliant on often over-crowded and under-resourced public healthcare services, while 72.9% of Europeans, 52% of Indians and 17.1% of South African Coloured report having private health insurance22. Additionally, a recent study in South Africa of 341 PCa cases reported that only 76 (22.3%) had awareness of PCa before diagnosis, with less than 50% of cases seeking medical help after PCa diagnosis23, and as such we call for further programs focused on bringing education and awareness across the region.

However, identifying as a Black South African represents a rich ethno-linguistic and, as such, genetic and cultural diversity24,25, calling for caution in singularizing African ancestry within the region. This was highlighted in a 2017 study that showed the occurrence of malignancies to vary across different east African population identifiers26. Population classifiers aid in capturing the nuanced genetic variations and susceptibility to diseases among different African subgroups. Using a smaller study population, we have previously shown a marginally increased PCa risk associated with the Venda Nation9. Here, through self-reported ethno-linguistic identification, we show the Nguni people to be statistically significantly more likely than Europeans to be diagnosed with PCa. Appreciating the limited number of cases, after adjusting the model for study variables we found that Tsonga and Venda ethnicities were also associated with PCa. Furthermore, we found that the Tsonga people were more likely to present with the advanced form of the disease after adjusting for age, providing, for the first time, within-region insights. Assuming genetic risk, we previously alluded to a differential ancestral Bantu fraction within the Tsonga versus Sotho-Tswana, more closely reflecting that of the Venda peoples, although Nguni speakers were excluded from this initial analysis6. Through genetic population substructure analyses for the 780 Black South Africans self-identifying as Nguni, Sotho-Tswana, Tsonga or Venda speakers, we demonstrate unique population ancestral identifiers, with the Tsonga and Venda peoples representing more recent shared ancestral fractions. Besides shared genetics, the Tsonga in our study were largely recruited from the malaria endemic region of South Africa. As this region is coinhabited by the Venda peoples, we have previously speculated on the potential impact on PCa risk and aggressive disease9 of annual dichlorodiphenyltrichloroethane (DDT) spraying, which has taken place since the mid 1950s and has been associated with urogenital malformations in newborn Venda boys exposed in utero27,28. Another possible explanation might be the frequent use of the medicinal plant “Xidomeja” (J. zeyheri) by the Tsonga, which has been reported to contain a diterpenoid used in synthetic vitamin E29. Notably, men using these supplements tend to be diagnosed with high-grade PCa30,31.

Besides African-specific ethno-linguistic identifiers, concurring with previous studies, we associate gynaecomastia, erectile dysfunction and STDs with PCa risk32,33. While STDs were previously not associated with PCa risk in our smaller SAPCS study, gynaecomastia was associated with aggressive disease presentation, and erectile dysfunction with increased PCa risk and aggressive disease, including earlier onset of well differentiated tumours9. Although the associations with gynaecomastia and erectile dysfunction can be due to increased patient age or PCa treatment32, all SAPCS study participants were treatment naïve at the time of recruitment, and the association of erectile dysfunction with PCa was no longer observed after adjusting for age. It is therefore highly likely that the older age of the cases is driving the positive correlation with erectile dysfunction. One must, however, caution that the controls in this study cannot be regarded as “healthy controls”, as most of them were elderly men with urological symptoms such as enlarged prostate or cystitis.

Additionally, we associate a high poverty rate with advanced disease. It is well established that people with a lower income are less likely to use medical services34. Red meat consumption can be indicative of better economic status and was inversely associated with advanced disease. In most African cultures red meat was consumed as part of ceremonial celebrations or gatherings of families and communities. This contrasts with western cultures, specifically within South Africa, where red meat is consumed daily35. Complete balding pattern was also inversely associated with advanced PCa, which, although arguably converse to the inconsistent European-biased studies36, concurs with previous observations for Southern African men, including a decreased risk for advanced PCa further associated with younger age of balding9. While some testosterone inhibitors such as finasteride, which are used to treat baldness in men, also tend to lower PCa incidence and its mortality rate37, these are unlikely to be commonly used within a less affluent study cohort. Age is usually a predictor of advanced-grade cancer where there is a national screening program for a specific age group38,39, hence an association between age and advanced disease was not expected in this study.

Unlike many malignancies, PCa is usually a slow-progressing disease and as such can remain undetected for a long time before its diagnosis40. Black South Africans were diagnosed at an older age in this study. This again might reflect poor health-seeking attitudes, reliance on traditional methods of health care, and a lack of screening and insurance coverage, which cause a delay in PCa diagnosis in this population. However, this delay in diagnosis did not seem to be the cause of the advanced disease among Africans, since the associations were still significant after adjusting for age. That people with a positive family history of PCa were diagnosed at a younger age suggests there was probably a wariness about PCa in this group. In addition, PCa with a pathogenic genetic variant usually occurs at a younger age41. Some of the factors associated with an old-age diagnosis of PCa, such as residing in subsistence farming areas, are likely to reflect poor socioeconomic status. Since it is well established that Black ethnicity is a risk factor for PCa2, the observation that Black South Africans are less likely to have a PCa family history suggests that many Black South Africans remain undiagnosed.

The interaction term analysis was used to determine whether any of the study variables have different associations with advanced disease in Black southern Africans compared with non-African ethnicities. Red meat consumption was a significantly stronger predictor of advanced PCa in Black South Africans compared with others. This may stem from the fact that men of European ancestry are more likely to voluntarily choose vegetarianism, while red meat consumption better reflects economic status for Black South Africans42,43. STDs also put Black men at greater risk of presenting with ISUP ≥ 3 disease. In other words, a Black South African man with an STD is significantly more likely to be diagnosed with ISUP ≥ 3 disease than a non-Black man with an STD. This shows that, in addition to screening for PCa, awareness should be raised among the Black community to preserve their sexual health. Age older than 75 years was also a less decisive factor for being diagnosed with advanced disease for African men, with ISUP group grades more akin to those of people younger than 60 years if they were Black South Africans.

The main strength of this study was the broad coverage of PCa cases over two provinces of South Africa and the investigation of some novel factors within arguably one of the most genetically diverse populations globally. Considering the generalizability of our findings, it is important to note that our study sample, defined by specific inclusion criteria, may not perfectly mirror the broader population. We did exclude participants with unknown PCa status, but this group represented a small fraction of our sample. While our results remain robust and consistent, sensitivity analyses indicate that the associations are not solely dependent on our sample composition. As such, the inclusion of the sensitivity analysis for unknown cases provided some reassuring known PCa significances, for example aspirin use reducing PCa risk44 and presence of STDs increasing risk45. While arguably under-powered compared to European-biased epidemiological studies, as the world recognises the importance of inclusivity and equity, specifically with regards to under-representation across the African diaspora, this study provides important insights as the largest regionally defined Sub-Saharan PCa study of its kind to date.

Conclusion

In the era of personalized medicine, epidemiological factors often receive less attention than molecular factors and tumor characteristics. Here, we have shown that the Black South African ethno-linguistic identifier is associated with PCa and aggressiveness at diagnosis, regardless of other environmental factors, and, novel to this study, we highlight an increased risk for advanced disease in men from the Tsonga Nation. Additionally, we found poverty rate to be a decisive factor in being diagnosed with advanced PCa and in delayed diagnosis. Our results confirm that men with African ancestry should be encouraged to undergo earlier-aged PCa screening, with emphasis on preservation of sexual health and implementation of programs focused on PCa education and awareness. Complementary and parallel to genetic risk association studies focused on PCa health disparities, it is paramount that contributing non-genetic epidemiological risk factors are interrogated, especially as one considers the rich demographic, lifestyle and environmental diversity across the greater regions of Sub-Saharan Africa. The results of our investigation into risk factors within the African population provide a deeper understanding of the universality of certain risk factors associated with prostate cancer. This highlights the pressing need for greater inclusivity and equity in future prostate cancer studies.