Skip to main content
Have a personal or library account? Click to login
Genome-wide assessment of genetic diversity, population structure and selection signatures in Polish mountain sheep breeds using SNP microarrays* Cover

Genome-wide assessment of genetic diversity, population structure and selection signatures in Polish mountain sheep breeds using SNP microarrays*

Open Access
|Jun 2026

Full Article

The tradition of mountain sheepherding in Poland has a centuries-old history. Mountain sheep are an integral element of the pastoral economy, not only being the basis of subsistence for the local community of the Polish Carpathians, but also because of their role in highlander culture and shaping the mountain landscape. They provide milk for the production of traditional products, wool, leather, and delicious meat. The breeds are perfectly adapted to the harsh climatic conditions and terrain of the mountains. Mountain sheep currently include three breeds: the Podhale Zackel, the Polish Mountain Sheep, and the Colored Mountain Sheep.

The sheep of the Polish Carpathians descend from the Vlach Zackel, which the Vlachs brought with them in the 14th century (Kłapyta, 2014), leaving a lasting mark on the economic and cultural life of the inhabitants of the Polish Carpathians. From the Vlachs, the Podhale people adopted almost everything related to mountain pastoralism: the system of organizing grazing, along with all the associated rituals, the technology of processing sheep's milk and the accompanying nomenclature (Figure 1). The Podhale sheep had a white, mixed woolly coat, with dark spots on the muzzle, ears, and around the eyes. Black and dark brown individuals also occurred in the population. These sheep were used for a wide range of purposes, with the wool being used for the production of cloth, felt, and outerwear, while the milk was processed into traditional cheeses during their grazing in mountain pastures (Kawęcka and Krupiński, 2014). Several attempts were made to improve the sheep's utility by introducing various sheep breeds. After World War II, the Podhale Zackel was refined using Transylvanian Zackel imported from Romania and Friesian rams. As a result, sheep with increased body mass, higher milk yield, and a modified coat character were obtained. The wool became finer, its yield doubled, and the outer coat hair lengthened at the expense of the medullary hair. The refinement process resulted in a new breed, and the entire group of mountain sheep, including the Zackel, was named the Polish Mountain Sheep.

Figure 1.

The origin of three mountain sheep breeds in Poland from the original Vlach Zackel (photos by W. Puchalski, M. Pasternak-Chorosz, A. Kawęcka)

The idea of preserving biodiversity and valuable resources of native breeds, each with its unique characteristics, has changed the direction of breeding efforts in mountain sheep farming. This was driven by the characteristics of this native population, such as excellent adaptation to local conditions, disease resistance, and a strong herding instinct, so important in traditional grazing. Work has been undertaken to restore the original Podhale Zackel breed by selecting individuals phenotypically consistent with the breed standard from the existing Polish mountain sheep population. Work on separating the colored sheep from the Polish mountain sheep population began 25 years ago, by selecting individuals from Polish mountain sheep flocks based on their exterior, color, and coat characteristics (Polak et al., 2020). Given the high utility value of mountain sheep, their excellent adaptation to the environmental conditions of the Polish Carpathians, and their special importance for the landscape and heritage of these regions, this breed has been included in a genetic resources conservation program, as an expression of special concern for their preservation for future generations. Currently, 17 native breeds in Poland are included in the conservation program (Kawęcka et al., 2022).

An important element of the conservation and management of farm animal genetic resources is the estimation of diversity and analysis of the genetic structure of populations. Among the most important molecular tools used in farm animal research in recent decades are mitochondrial haplotype analysis, microsatellite sequence analysis, and single-nucleotide polymorphism (SNP) analysis. Currently, BeadChip microarrays (Illumina) are in widespread use for various animal species (Kawęcka et al., 2016). Recent genomic studies in sheep using similar approaches have provided valuable insights into genetic diversity, population structure, and selection patterns (Talebi et al., 2025). The aim of this study was to characterize genome-wide genetic diversity, population structure and signatures of selection in three Polish mountain sheep breeds (Podhale Zackel, Polish Mountain Sheep and Colored Mountain Sheep) using SNP microarray data, and to evaluate their implications for genetic resource conservation.

Material and methods

The study included 300 ewes belonging to three native breeds of mountain sheep: the Podhale Zackel (PZ), the Polish Mountain Sheep (PMS), and the Colored Mountain Sheep (CMS). The experimental material consisted of blood collected from randomly selected ewes from two different flocks in each breed group (Figure 2):

  • PZ – 50 samples in Pieniny Spiskie and 50 samples in Bieszczady Mountains,

  • PMS – 38 samples in Pieniny Spiskie and 65 samples in Pieniny Czorsztyńskie,

  • CMS – 50 samples in Pieniny Spiskie and 47 samples in Podhale.

Figure 2.

Map of the Polish Carpathians with the locations where the material was collected from the examined sheep

Related individuals were not excluded before further analyses.

A 5 ml portion of blood was collected from the sheep in the morning before milking by a veterinarian, with the approval of the Ethical Committee (Local Ethical Committee No. II in Kraków – approval no. 1293/2016).

Genomic DNA was isolated from blood using the QuickGene DNA whole blood kit S (Kurabo, Tokyo, Japan) according to the manufacturer's protocol. After DNA concentration and quality control, samples were normalized and genotyped using the OvineSNP50 BeadChip microarray (Illumina, San Diego, CA, USA) and the HiSanSQ system (Illumina). Genotypes were determined using GenomeStudio software (Illumina). SNP filtering was performed before the actual calculations. The genotyped SNP positions were remapped to the ISCG Oar_v3.1 genome assembly. SNPs located on the Y chromosome and unmapped SNPs were removed. SNP markers with undetermined genomic positions, those with more than 10% missing genotypes, and monomorphic markers across all breeds (i.e., MAF <0.01) were excluded. Finally, SNP markers were tested for deviations from Hardy-Weinberg equilibrium in each breed, and as a consequence, markers that deviated very strongly from equilibrium (P<0.0001) were rejected, which may indicate errors in the genotyping process. The presented results are after filtering.

The indicators of genetic variation considered in this study were the distribution of minor allele frequencies (MAF), mean expected heterozygosity (HE) and mean observed heterozygosity (HO), as well as the mean value of the inbreeding coefficient FIS calculated according to Wright's method (Wright, 1949) according to the formula: FIS=1HOHE {F_{IS}} = 1 - {{{H_O}} \over {{H_E}}}

To visualize the differentiation between mountain sheep breeds and the variation within populations, PLINK v1.90b5.3 64-bit software (Purcell, 2007) was used to generate the first three components of a principal components analysis (PCA).

Effective population size was determined for all studied mountain sheep breeds based on linkage disequilibrium between markers using NeEstimatorv2 software (Do et al., 2014). To reduce computational time and to minimize the impact of SNP linkage, analyses were performed based on a panel of randomly selected 1,000 polymorphic SNPs for each studied breed. The narrow confidence intervals obtained with the Jackknife approach indicate that our NeN_eNe estimates are stable and reliable. Furthermore, the estimates are highly consistent with those reported in our previous study, where NeN_eNe was calculated using LD-based methods and the full SNP dataset (Jasielczuk et al., 2023).

To identify genomic regions showing signs of differential selection between breeds, Wright's F-statistic (FST) analyses were performed, assessing the inbreeding parameters of a given subpopulation compared to the entire population (distance between populations), using PLINK v1.90b5.3. For each SNP, the FST value was calculated, obtained by pairwise comparisons between breeds, i.e., PZvsPMS, PZvsCMS, and PMSvsCMS (Wright, 1949; Akey et al., 2002). The FST value was then standardized according to methodology of Wright (1949) and Akey et al. (2002), for each breed using the formula: di=FSTijEFSTijsdFSTij {d_i} = {{F_{ST}^{ij} - E\left[ {F_{ST}^{ij}} \right]} \over {{\rm{sd}}\left[ {F_{ST}^{ij}} \right]}} where:

  • EFSTij E\left[ {F_{ST}^{ij}} \right] – average FST,

  • sdFSTij {\rm{sd}}\left[ {F_{ST}^{ij}} \right] – standard deviation for FST.

0.1% top di values were considered as significant signals of differential selection.

Quality-filtered genotypes were additionally subjected to phasing using default parameters in FASTPHASE software (Scheet and Stephens, 2006). Next, for each breed, positive selection signatures were identified using the REHH test using SWEEP v.1.1 software (Sabeti et al., 2002). This procedure yielded a dataset that was then analyzed by calculating the number of haplotype blocks for which haplotype frequencies were greater than 0.25 and for which the probability of the statistical test for REHH was lower than the significance threshold of P = 0.05. All identified genomic regions potentially being under selection pressure were analyzed for their distribution in the genome, the genes they encode, and their functional significance for the selected traits. To this end, we first used the Ensembl database (https://genome.ucsc.edu/cgi-bin/hgTables) to identify genes encoded by the indicated regions. Subsequently, the resulting genes were analyzed for their function using the online bioinformatics tool DAVID Bioinformatics Resources 6.8, NIAID/NIH. This tool also allowed us to associate the annotated genes with their corresponding metabolic pathways and to test the statistical significance of the enrichment of individual pathways by the detected genes (Huang et al., 2009).

Results

A total of 47,592 SNP markers were used for the analyses. Missing genotypes were observed for 3,281 SNPs, with an average missing genotype count of 0.0198. The mean minor allele frequencies (MAF) for the whole Mountain Sheep group was 0.300 (±0.124), with the highest mean value observed on chromosome 21 (0.309±0.121). The distribution of the minor allele frequencies for Mountain Sheep is presented in a histogram (Figure 3). The highest mean frequency of the minor allele was observed for the Podhale Zackel (0.302), while for the PMS and CMS, these values were lower and similar (0.297 and 0.298, respectively). The highest values of mean expected and observed heterozygosity were noted for the Podhale sheep, while the FIS value for all mountain sheep breeds was below zero.

Figure 3.

Frequency distribution of the rarer allele for tested sheep

The mean expected heterozygosity of the marker panel was 0.390 (±0.109), and the mean observed heterozygosity was 0.388 (±0.111).

Significant (P<0.05) deviations from Hardy-Weinberg equilibrium were observed for 2,749 markers, and 718 markers showed highly significant (P<0.01) deviations. The highest number of markers significantly and highly significantly deviating from the Hardy-Weinberg equilibrium was observed for PMS (2207 and 479 markers, respectively). The mean values of the genetic variation indices for the breeds are summarized in Table 1.

Table 1.

Genetic variability indexes for the Mountain Sheep subpopulations

BreedMAFHOHEFISHWE P<0.05HWE P<0.01
PZ0.302 (±0.122)0.397 (±0.116)0.392 (±0.107)−0.01251696322
PMS0.297 (±0.124)0.391 (±0.119)0.386 (±0.110)−0.01222207479
CMS0.298 (±0.124)0.393 (±0.119)0.388 (±0.110)−0.01371790346
Mean to all breeds 0.2990.3940.389−0.01281898401

Effective population size was determined for the sheep breeds studied, and significant variation was observed among them. Individual breeds had effective population sizes ranging from 64 to 103 individuals. The results are summarized in Table 2.

Table 2.

Effective population size (Ne) and 95% confidence interval of Ne for the studied sheep breeds

BreedNe95% CI down95% CI up
PZ94.292.795.7
PMS103101.3104.7
CMS64.663.865.4

Principal component analysis (PCA) was used to graphically illustrate the interbreed and intrabreed variability of Mountain Sheep. For each individual, the values of the first three components of variability were determined (PC1, PC2, and PC3).

Figure 4 presents a visualization of the data for PC1 and PC2 in mountain sheep. It allows for the observation of three distinct clusters, representing the breed lines studied. The figure for the PC1 and PC3 data for mountain sheep, similarly to the first two components (PC1 and PC2), shows a group of poorly differentiated individuals in the middle section. In this layout, the internal genetic variability of the Polish mountain sheep group is even more pronounced.

Figure 2.

Visualization of genetic differentiation based on PCA - A - PC1 vs. PC2 components for mountain sheep; B - PC1 vs. PC3 components for mountain sheep

The highest values of mean and weighted mean genetic distance (FST) between breeds were recorded between PMS and CMS (0.0236 and 0.0254), and the lowest between PZ and PMS (0.0142 and 0.0150). The mean and weighted mean FST values are summarized in Table 3.

Table 3.

Average (above the diagonal) and weighted average (below the diagonal) FST distances among the studied breeds

BreedPZPMSCMS
PZ----.pdf"/>0.01423640.0197141
PMS0.014995------0.0236004
CMS0.02082610.0254135------

Based on the FST values for individual SNPs and the mean FST values for individual breed pairs, standardized FST values (di) were determined. In individual mountain sheep breeds, 11 to 17 genomic regions with strong FST-based selection signals were detected, ranging in size from 305.9 kbp to 1.5 Mbp. The most selection signals were detected in the Podhale Zackel, and the least in the Polish Mountain Sheep. The most signatures were noted on chromosome 2 in the Podhale Zackel and the Polish Mountain Sheep, while in the Colored Mountain Sheep on chromosome 14. No signals of differential selection were detected on chromosomes 10–12, 15, 17–22, and 25–28. Three signatures occupying common genomic regions were identified for the Podhale Zackel and the Polish Mountain Sheep, located on chromosomes 2, 3, and 16, and one signal located on chromosome 4, was common to all breeds. The visualization of selection signals in the form of Manhattan plots for the studied breeds is shown in Figure 5. The dashed line indicates the top 0.1% of di values in each breed. This corresponds to the top 0.1% of di values from the empirical distribution. This threshold was selected based on previous literature using the same cutoff (Pérez O'Brien et al., 2014; Purfield et al., 2017; Lukic et al., 2023) and was additionally motivated by the need to obtain a number of significant results that can be reasonably interpreted and analyzed.

Figure 5.

Di values in all analyzed autosomes of mountain sheep

Annotation of the obtained regions under selection using the Ensembl database identified 92 genes for the Podhale Zackel, 90 for the Polish Mountain Sheep, and 100 for the Colored Mountain Sheep. From this pool, 56 genes with known function for ZN, 55 for PMS, and 59 for CMS were used to analyze metabolic pathways using the DAVID 6.8 online tool. Of these, two genes were common to PZ and PMS (C1D, DMRTA1), four were common to PZ and CMS (GNAQ, LONP2, N4BP1, SIAH1), and two were common to all three breeds (ISPD, SOSTDC1). Functional classification of well-annotated genes showed that genes detected in regions under selective pressure were enriched in processes related to cellular responses, regulation of kinase and enzyme activity, and ATP binding. The most statistically significantly enriched processes were observed for the Podhale Zackel. For the PZ and CMS, one process (ATP binding) and a group of three genes common to both breeds (ABCC11, ABCC12, and LONP2) were observed. No statistically significantly enriched processes were observed for the PMS.

It was also found that all breeds share a common selection signal located in a region of the chromosome 4. Annotation of this region in the Ensembl database identified two genes, namely ISPD (Isoprenoid synthase domain containing; Chr 4 Exon 10, ENSOARG00000008410, Sequence: NC_056057.1) and SOSTDC1 (Sclerostin domain containing 1; Chr 4 Exon 2, ENSOARG00000008418, Sequence: NC_056057.1).

Analyses conducted using Sweep v.1.1 software identified 1,708 (PZ), 1,901 (PMS), and 1,963 (CMS) core haplotype blocks (CRs), with an average length of 89.4 (58.5) kbp for PZ, 95.9 (127.2) for CMS and 96.3 (125.9) kpz for PMS. For all haplotype blocks, from 13,338 (PZ) to 15,752 (CMS) REHH tests were performed. Tests with haplotype frequencies lower than 0.25 were then removed from the results, resulting in 5,606 (PZ) to 6,501 (CMS) core haplotypes. From the remaining tests, those that were statistically significant, i.e., P<0.05 (representing the strongest selection signals), were selected. The resulting REHH p values were logarithmized and plotted against chromosomal position to visualize the core haplotypes and patterns of selection in the genome for each breed (summarized in Figure 6, where the dashed line represents the significance threshold of 0.05). There were between 279 (PZ) and 326 (CMS) significant haplotype cores (Table 4). After combining haplotypes spanning the same cores, 210 haplotype blocks remained for PZ, 215 for PMS, and 216 for CMS. The lengths of the haplotype cores, their numbers, and the other values described above are collected in the supplementary materials.

Figure 6.

Genomic map of REHH probability values for all core haplotypes showing frequencies >0.25 for mountain sheep. The green dashed line indicates a REHH probability level of 0.05

Table 4.

Characteristics of haplotype cores for PZ, PMS and CMS

SpecificationPZPMSCMS
Number of haplotype cores 170819011963
Average haplotype core length (SD)89.4 (58.5) kbp 95.9 (127.2) kpz 96.3 (125.9) kpz
Number of REHH tests 133381481815752
Number of haplotypes more frequent than 0.25560663296501
Number of cores for which haplotype frequency >0.25 and P<0.05279326321
Number of unique haplotypes 210215216

The genomic positions of statistically significant core haplotypes were used to identify genes potentially associated with the identified selection signals. The procedure was performed similarly to that for differential selection, using the ENSEMBL database for gene identification. This resulted in 197 genes for PZ, 329 for PMS, and 264 for CMS. The DAVID was then used to determine the functions of the identified genes. For the Podhale Zackel, the program identified 197 genes with known function, 329 genes for the Polish Mountain Sheep, and 156 genes for the Colored Mountain Sheep. In the Podhale Zackel, only one process controlled by a group of 17 genes was statistically significantly enriched, and it was related to ATP binding, similar to the results obtained from the FST analysis. For the Polish Mountain Sheep, five processes involving 22 genes were detected, while for the Colored Mountain Sheep, seven processes involving 47 different genes were identified. For the Podhale Zackel and the Colored Mountain Sheep, the same process was identified: ATP binding, for which two genes (PAN3 and NUBPL) were common to both breeds.

Discussion

Effective management of livestock genetic resources requires comprehensive knowledge of breeds, including data on population size and structure, geographic distribution, production environment, and genetic diversity within and between breeds. Combining different types of data allows for a more comprehensive assessment of biodiversity within and between breeds, thus facilitating effective resource management. The genetic structure of a population is described by a number of parameters, such as the number of alleles, heterozygosity, inbreeding, variation between populations, and so on. Populations may differ in one or more characteristics that describe their genetic structure. Genetic structure parameters can indicate the population's fitness and trends in change (Groeneveld et al., 2010).

Heterozygosity is an important indicator for assessing the genetic variability of domestic animals. Heterozygosity reflects the genetic potential and adaptability of a population to its natural environment (Zhang et al., 2018). Deniskova et al. (2018), analyzing the genotype of 25 Russian local breeds using the OvineSNP50 BeadChip microarray, found heterozygosity values ranging from 0.354 to 0.359, comparable to those observed in other sheep breeds worldwide. Ciani et al. (2014), using the OvineSNP50 BeadChip, analyzed the genetic structure of 19 Italian breeds. The authors found distinct breeds inhabiting Sardinia, which are a consequence of both sporadic introgressions of wild mouflon in ancient times and the island's long-term genetic isolation from mainland sheep. The authors presented arguments that challenge, from a genomic perspective, the current classification of Bergamasca and Biellese sheep into two distinct breeds. Expected heterozygosity in the Italian breeds ranged from 0.33 to 0.37. These values were consistent with those observed by Kijas et al. (2012) for southern and western European sheep, analyzed using the IlluminaOvineSNP50BeadChip. Their study reported observed heterozygosity of 0.33 to 0.38, with the exception of the Macarthur Merino, for which the value was 0.22. The authors suggest that, in sheep, breed heterozygosity revealed a weak association with increasing distance from the “center of domestication”. One likely explanation is the widespread use of Merino rams in Europe, which began after the Middle Ages, resulting in the widespread dissemination of haplotypes between Merino and other breeds. In our own studies, heterozygosity values for mountain sheep ranged from 0.393 to 0.397. These values were comparable to those obtained in European sheep and higher than those in Asian sheep (Zhang et al., 2018; Deniskova et al., 2018; Ciani et al., 2014; Kijas et al., 2012). As suggested by the authors of the studies cited above, this may indicate a higher level of domestication of Polish native mountain sheep breeds.

Abied et al. (2020) presented analyses conducted on five Chinese native sheep breeds. They used high-resolution 600K SNP microarrays. The calculations showed that the FIS inbreeding coefficient ranged from −0.03 for Karakul sheep to −0.004 for Wadi sheep. Our own research did not reveal a risk of increased inbreeding indices in any of the studied populations (FIS ranging from −0.0122 to −0.0137). This may indicate proper breeding practices and appropriate ram selection.

There is a close relationship between the inbreeding growth rate (ΔF) and the effective population size (Ne), as ΔF = 1/2Ne, where ΔF represents the inbreeding growth rate and Ne represents the effective population size (Filistowicz, 2015). The effective population size (Ne) is one of the most important parameters considered in analyzing the condition of a conserved population and assessing its threat status. In Germany, using only this indicator, the following thresholds were adopted for assessing the threat status of breeds: Ne ≤ 50 – only phenotypic protection is possible; 50 <Ne ≤ 200 – the population requires protective measures; 200 < Ne ≤ 1000 – the population requires monitoring, Ne > 1000 – the population is not threatened (Krupiński and Polak, 2018). The effective population size in the study by Deniskova et al. (2019) varied for individual sheep groups, ranging from Ne = 128 for the Kyrgyz wire-haired sheep to Ne = 660 for the Gissar sheep. The authors suggest that even the lowest values obtained were higher than in previous studies by our own and other authors, or consistent with observations reported by the SheepHapMap project (Kijas et al., 2012). They suspect that the larger effective population sizes likely correspond to the widespread and high popularity of these studied breeds in Central Asian countries. According to Deniskova et al. (2019), the low effective population size values of the Kyrgyz Alai and Tien-Shan breeds were likely due to a limited founder population, as both breeds were created and bred within scientifically developed breeding programs.

Preliminary results of molecular studies using SNP microarrays comparing effective population size between Polish breeds covered by genetic resources conservation programs were presented by Gurgul et al. (2019). They showed variation in this parameter between species. The highest mean Ne values were found in horses and cattle (97 and 94, respectively), relatively low in pigs (46), and intermediate in sheep (60). The results obtained in our study showed significant variation within the mountain sheep population itself, from the lowest Ne = 64.6 for the Colored Mountain Sheep to Ne = 103 for the Polish Mountain Sheep.

Visualization of genetic diversity using the PCA method, based on the genetic profiles of individuals, revealed a clear division of the studied population into three clusters, corresponding to separate breeds. The analysis revealed that all breeds had distinct genetic structures, and individuals within each breed clustered closely together. However, some individuals were found in the area overlapping for all three populations. For some individuals from each breed, significant similarity to other breeds was observed. This may indicate a close relationship between the studied individuals, currently belonging to separate breeds but historically originating from the same mountain sheep population. These data likely reflect the common phylogenetic origin of all three mountain sheep breeds. Based on the PCA results, it can also be concluded that the Podhale Zackel represents the most homogeneous genetic group, while the Colored Mountain Sheep represents the most internally diverse group.

Talebi et al. (2025) investigated the genomic consequences of recent, targeted crossbreeding in Moghani sheep by comparing purebred animals with Texel- and Booroola-derived paternal backcrosses, using genotyping-by-sequencing to assess population structure, genetic differentiation, and inbreeding. PCA clearly separated purebred Moghani sheep from all crossbred groups, indicating strong genomic restructuring due to introgression, while pairwise FST values (approximately 0.007–0.018) confirmed increased differentiation between pure and crossbred populations, particularly in Texel-derived lines. Although crossbred lambs showed higher numbers of polymorphic variants, they also exhibited longer runs of homozygosity and higher ROH-based inbreeding coefficients, indirectly suggesting reduced effective population sizes and potential loss of adaptive resilience. FST was interpreted together with ROH and heterozygosity to show that introgression did not uniformly increase genetic health, as higher differentiation coincided with increased genomic inbreeding in some crosses. In contrast, our study on native Polish mountain sheep breeds within a conservation framework using SNP microarrays, where PCA revealed three distinct but partially overlapping breed clusters reflecting shared historical ancestry rather than recent admixture, and FST values (approximately 0.014–0.025) indicated moderate differentiation typical of closely related breeds. Importantly, we directly estimated effective population size, obtaining Ne values between about 64 and 103, which, together with low inbreeding coefficients, supported the demographic stability of these conserved populations. Overall, while both studies applied similar population-genetic tools, Talebi et al. (2025) emphasized short-term genomic changes and trade-offs arising from crossbreeding for productivity, whereas we focused on long-term genetic structure, differentiation, and demographic viability in the context of breed conservation.

It is well known that selection increases the frequency of particular alleles at different genomic loci in a selected population and creates unique genetic patterns in their DNA sequence. Several methods are available to scan the genome for selective signatures by assessing patterns of variation within and between breeds. Among them, FST-based signatures and extended haplotype homozygosity (EHH) are a reliable approach for detecting genomic regions subject to relatively recent selective pressure (Bomba et al., 2015).

A comparative study of selection signatures in two species was conducted by Kim et al. (2016), who analyzed 50K BeadChip microarray data from indigenous sheep and goat populations from the hot and arid environment of Egypt, where natural selection played a major role. Genomic regions encompassing 119 genes were identified, most of which were responsible for metabolism. Notable signatures included genes influencing traits related to adaptation to unfavorable environmental conditions. Among the identified genes were genes responsible for thermoregulation (FGF2, PLCB1), digestion (MYH, TRHDE), body size (BMP2, GJB3), and immune response (IL2, GRIA1). The authors also observed shared signatures between sheep and goats, suggesting shared selection in these two species, which are ultimately closely related. Furthermore, sheep and goats are useful as model organisms for mapping selection signatures and genomic linkages.

Abied et al. (2020) analyzed population structure and selection signatures among five indigenous Chinese sheep breeds (using high-density 600K SNP genotypes) selected from different geographic locations with extremely dry or humid conditions. Genomic regions under selection identified by FST and XP EHH methods frequently overlapped across breeds and included genes associated with adaptation to extremely dry or humid environments, innate and adaptive immune responses, and traits related to growth, wool, milk, and reproduction.

In another previous work, Eydivandi et al. (2021) analyzed 34,206 SNP markers from 450 individuals of 15 sheep breeds, including six indigenous Middle Eastern sheep breeds. Using both FST and REHH statistical analysis, 629 regions and 256 genes were detected as being under selection, respectively. These regions encoded genes identified as influencing economic traits, namely CIDEA, HHATL, MGST1, FADS1, RTL1, and DGKG. Both FST and REHH identified 60 common genes as selection signatures. Among them, four candidate genes were identified: NT5E, ADA2, C8A, and C8B, which were enriched in two significant functional categories from the Gene Ontology (GO) database, related to adenosine metabolism. The authors emphasize the high value of the acquired knowledge about genomic regions subject to selection pressure, which will enable future insights into the role of the selected genes in local adaptation in the studied populations.

Using FST statistics, Gurgul et al. (2021) identified genomic loci with the greatest genetic divergence between seven analyzed breeds of Polish native sheep: Blackface, Colored Merino, Old Type Merino, Polish Mountain Sheep, Świniarka, Uhruska, and Wrzosówka. The identified genomic regions with the strongest selection signals encoded 781 distinct genes. The RXFP2 gene locus, responsible for horn size/type in sheep, was identified as the most prominent selective signal.

The results of Yurchenko et al. (2019) showed that the genomes of 15 Russian sheep breeds contain numerous regions likely subject to selection pressure. More than half of these regions contained well-known candidate genes related to morphology, adaptation, and domestication (KITLG, KIT, MITF, and MC1R), wool quality (DSG, DSC, and KRT), reproduction (CMTM6, HTRA1, GNAQ, UBQLN1, and IFT88), and milk production (ABCG2, SPP1, ACSS1, and ACSS2). Furthermore, the authors observed selection signatures in gene regions whose function in the human genome is associated with sensory neuropathies and genetic disorders affecting the inability to sense pain and temperature. This may indicate a possible mechanism for adaptation to harsh climatic conditions.

In the mountain sheep analyzed in this study, a region under selective pressure common to all breeds was detected. Among the genes located in this region are ISPD and SOSTDC1. Although the function of ISPD in mammals is unknown, mutations in this gene lead to reduced functional glycosylation of dystroglycan, which is a common cause of various forms of dystroglycanopathy (Cirak et al. 2013). Gouveia et al. (2017), in studies conducted on local Spanish sheep breeds, also detected the selection signal spanning the ISPD gene. They suggest a role for the ISPD gene in nervous system development. The SOSTDC1 gene, in turn, encodes a sclerostin-containing protein, an alternative protein for the differentiation of follicular helper cells and regulatory follicular T cells during acute viral infection (Hu et al. 2019). This allows us to assume that the presence of this gene in the selection signature region has a positive effect on resistance to viral infections.

Närhi et al. (2012) investigated the effect of the SOSTDC1 gene on modulating the hair follicle and mammary gland morphogenesis pathway in mice. Mice with an inactivated SOSTDC1 gene were used to investigate the role of this gene in modulating the bone morphogenetic protein (BMP) signaling pathway and the WNT family of proteins, which control the initiation and formation of skin appendages. They observed that the SOSTDC1 gene did not affect the hair growth cycle in the skin but limited the number of developing hair follicles and whiskers on the face of mice. Furthermore, this gene influences the size and shape of the mammary glands, and its knockout promotes the development of supernumerary mammary nipples. According to Närhi, SOSTDC1 can be hypothesized to attenuate signaling of the WNT/β-catenin pathway. Nie et al. (2018) conducted research on tissues collected from fetal Tibetan sheep to investigate the influence of long non-coding RNAs (lncRNAs) and mRNAs on the primary induction of wool follicles in sheep. RNA sequencing revealed signals in the WNT, BMP, and FGF (fibroblast growth factor) protein pathways. GO and KEGG analyses revealed enriched biological processes of morphogenesis and transforming growth factor-β, as well as intercellular matrix-receptor interactions. mRNA and lncRNA interaction networks revealed transcripts likely involved in the induction of primary wool follicles. Among the genes identified as hair follicle marker genes, EDAR, FOXI3, FGF20, SHH, and SOSTDC1 showed increased expression when primary wool follicles were initiated in sheep.

Findings and conclusions

This study characterized the genetic structure of Polish native sheep breeds, classified as mountain sheep, using SNP microarrays. It was found that the studied populations have distinct genetic structures, with a certain pool of individuals sharing a similar genetic profile. Native mountain sheep breeds were characterized by similar levels of heterozygosity. The lowest effective population size was observed in CMS, which is the least favorable from the point of view of genetic diversity. The highest genetic distance was observed in PMS vs. CMS, and the lowest in PZ vs. PMS. The low inbreeding coefficient indicates a lack of inbreeding risk. In genomic regions with strong signals of differential selection, 92 genes were identified for the Podhale Zackel, 90 for the Polish Mountain Sheep, and 100 for the Colored Mountain Sheep. Two genes (ISPD – nervous system development and SOSTDC1 – hair follicle morphogenesis) were found in the common selection signal on OAR 4 across the breeds. Overall, SNP microarrays are an effective tool for assessing genetic variation and structure, which is definitely useful in genetic resource conservation programs, and may help maintain biodiversity and manage native populations, also by conscious selection of individuals for breeding in order to preserve the desired characteristics, such as adaptation to unfavorable environmental conditions.

Nevertheless, it must be acknowledged that the conducted studies were subject to certain limitations, such as sample size and SNP panel density.

DOI: https://doi.org/10.2478/aoas-2026-0009 | Journal eISSN: 2300-8733 | Journal ISSN: 1642-3402
Language: English
Submitted on: Oct 27, 2025
Accepted on: Jan 5, 2026
Published on: Jun 4, 2026
In partnership with: Paradigm Publishing Services
Publication frequency: 4 issues per year

© 2026 Anna Miksza-Cybulska, Aldona Kawęcka, Artur Gurgul, published by National Research Institute of Animal Production
This work is licensed under the Creative Commons Attribution 4.0 License.

AHEAD OF PRINT