Evaluation of Informative SNPs in Iranian Azeri Population

Introduction
New DNA analysis techniques have advanced human identification and forensic genetics. DNA-based identification is necessary for identifying human remains after mass disasters, wars, and sociopolitical events when personal factors or other physical traits are not informative.1 Short tandem repeat (STR) markers are now commonly used for identification in forensic tests. Nevertheless, they cannot be used for considerably degraded DNA, because standard STRs have long repetitive regions and do not yield reliable results for degraded samples. Alternative methods must therefore be used in such cases.1-4 One of them is genotyping mitochondrial DNA using short genetic markers. Although this method has advantages, the sequence diversity of these markers is lower than that of STRs. Also, because these markers are maternally inherited, some relationships, such as those between fathers and daughters, are difficult to establish, which limits their usefulness.2,3 Another alternative is to shorten the fragment length by redesigning the primers of known STRs; the resulting markers are generally called mini-STRs.3,4 A further common alternative in forensic tests is single nucleotide polymorphisms (SNPs), which have advantages over other markers.4,5 However, a prerequisite for using SNP genotyping for forensic purposes is information on allelic and genotypic frequencies in Iranian populations, which serves as the basis for developing systems with maximum power of discrimination.5 The present research was conducted to meet this need.
SNP genotyping can be performed by several methods. High-resolution melting (HRM) is a real-time polymerase chain reaction (PCR)-based method that uses differences in melt curves to detect variations in nucleic acid sequences. The melt curve is recorded by a real-time PCR machine from fluorescent dyes bound to the nucleic acids, which separate from the DNA as the double strand denatures.6,7 Features such as GC content, segment length, sequence, and heterozygosity cause differences in melt curves, and these data can be used in SNP genotyping, mutation screening, methylation analysis, and other applied research.6,7 In SNP genotyping, nucleotide variations cause differences in melt curves, although these differences are slight for point mutations and SNPs: the temperature differs by 0.5-1 °C for G/T, G/A, C/T, and C/A, by 0.5-1 °C for C/G, and by less than 0.2 °C for T/A nucleotide differences.6-8 The present research was intended to study the allele frequencies and efficiency of 4 SNPs listed in the SNPforID database (i.e., Rs1454361, Rs1355366, Rs2107612, and Rs2111980).

Materials and Methods
To prevent possible contamination with modern DNA, laboratory equipment was decontaminated with bleach containing 5% active chlorine and washed in sterile distilled water; finally, all surfaces were exposed to UV irradiation in an ultraviolet chamber for 60 minutes. All buffers and materials were autoclaved. All steps were carried out in separate areas under sterile conditions using sterile gloves and masks. Negative and positive controls were used for all stages of extraction, PCR, real-time PCR, and sequencing.
In accordance with the approximate distribution of ethnic groups in Iran's population, 100 unrelated individuals living in the northwest regions of Iran (40 from East Azerbaijan province, 25 from West Azerbaijan province, 20 from Ardabil province, and 15 from Zanjan province) were selected, and a 4 mL blood sample was taken from each. The blood samples were collected in tubes containing 0.5 M EDTA (pH 8) as anticoagulant and transferred to the laboratory.
The rapid genomic DNA extraction (RGDE) method was used to extract DNA from the blood samples. This method yields genomic DNA of high quality and quantity in a short time.9 The extraction product was then checked for quality by electrophoresis on a 1% agarose gel. In addition, DNA quantity was measured using NanoDrop (IMPLEN Company), and samples of low quality or quantity were re-extracted.
The nested PCR method was employed to obtain more accurate results. In this method, 2 pairs of primers are used to increase PCR sensitivity (Tables 1 and 2). All primers were designed using Oligo7 software. The first pair was used to amplify specific segments of the target DNA for 30 cycles (Table 3). The PCR product was then transferred to another tube and used as the template for HRM, which was performed with the second primer pair (Table 4) on a StepOnePlus™ Real-Time PCR System (Thermo Fisher). Serial dilutions of the initial PCR product were prepared, and HRM analysis was carried out for each one; the 0.001 dilution gave the best results. In the melt curve stage, the temperature was increased from 60 °C to 95 °C in increments of 0.3 °C, which was suitable for detection of the SNPs.
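As an illustration of how a melting temperature can be read from a melt curve recorded over this ramp, the following Python sketch estimates the negative derivative of fluorescence with respect to temperature (-dF/dT) and takes its peak as the melting temperature. The sigmoid fluorescence curves, the two melting temperatures (78.0 °C and 78.9 °C), and all function names are hypothetical and chosen only for illustration; they are not data or software from this study.

```python
import math

START_C, END_C, STEP_C = 60.0, 95.0, 0.3  # melt ramp described in the text

# Temperature points of the ramp (the last point is the final full step below 95 °C).
temps = [START_C + STEP_C * i
         for i in range(math.floor((END_C - START_C) / STEP_C) + 1)]


def synthetic_melt_curve(temps, tm, width=0.8):
    """Hypothetical fluorescence curve: a sigmoid falling from 1 to 0 around tm."""
    return [1.0 / (1.0 + math.exp((t - tm) / width)) for t in temps]


def negative_derivative(temps, fluorescence):
    """Central-difference estimate of -dF/dT at the interior temperature points."""
    points = []
    for i in range(1, len(temps) - 1):
        d_f = fluorescence[i + 1] - fluorescence[i - 1]
        d_t = temps[i + 1] - temps[i - 1]
        points.append((temps[i], -d_f / d_t))
    return points


def melting_temperature(temps, fluorescence):
    """Temperature at which -dF/dT is largest (peak of the derivative melt curve)."""
    return max(negative_derivative(temps, fluorescence), key=lambda p: p[1])[0]


# Two hypothetical genotypes whose melting temperatures differ by 0.9 °C.
tm_a = melting_temperature(temps, synthetic_melt_curve(temps, 78.0))
tm_b = melting_temperature(temps, synthetic_melt_curve(temps, 78.9))
print(f"Tm A = {tm_a:.1f} °C, Tm B = {tm_b:.1f} °C")
```

With a 0.3 °C increment, a peak position can only be resolved to about one step, so melting temperature differences smaller than the increment would not be separable by this simple peak-picking approach alone.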
Finally, allele frequencies of the SNPs were obtained by HRM analysis of DNA samples from 100 different individuals. After the HRM analysis, 3 samples from each peak were sequenced on a genetic analyzer (Applied Biosystems 3130XL) to confirm the results. The sequencing results were analyzed using the DNA Baser and Gene Runner software and generalized to the other samples. After genotyping, the results for each of the 4 SNPs were analyzed statistically using PowerStats version 12 software.

Ethical Considerations
All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki Declaration and its later amendments or comparable ethical standards (Ethics Code for Research: IR.BMSU.RBC.1396.177).

Results
After DNA extraction, the products were assessed by electrophoresis (Figure 1), which showed that DNA extraction was successful. Following the HRM analysis of each SNP, the plots showed separation and identification of the SNPs for each individual. Each SNP is represented by a specific color (Figures 2-4). The difference plot showed that HRM was performed successfully. The allele frequencies were G: 0.389 and A: 0.611 for Rs2107612; A: 0.789 and T: 0.211 for Rs1454361; T: 0.505 and C: 0.495 for Rs1355366; and G: 0.921 and A: 0.079 for Rs2111980.
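To show how such allele frequencies relate to genotype counts and to heterozygosity, the following Python sketch derives allele frequencies from hypothetical genotype counts and computes the heterozygosity expected under Hardy-Weinberg equilibrium for the frequencies reported above. The Hardy-Weinberg assumption, the example counts, and the function names are illustrative only; the values printed are expectations, not the observed heterozygosities of this study, and the C allele of Rs1355366 is read as 0.495.

```python
# Allele frequencies as reported in the Results (C for Rs1355366 read as 0.495).
ALLELE_FREQUENCIES = {
    "Rs2107612": {"G": 0.389, "A": 0.611},
    "Rs1454361": {"A": 0.789, "T": 0.211},
    "Rs1355366": {"T": 0.505, "C": 0.495},
    "Rs2111980": {"G": 0.921, "A": 0.079},
}


def allele_frequencies_from_genotypes(n_hom1, n_het, n_hom2):
    """Allele frequencies (p, q) of a biallelic SNP from genotype counts."""
    total_alleles = 2 * (n_hom1 + n_het + n_hom2)
    p = (2 * n_hom1 + n_het) / total_alleles
    return p, 1.0 - p


def expected_heterozygosity(freqs):
    """Heterozygosity expected under Hardy-Weinberg equilibrium: 1 - sum(p_i^2)."""
    return 1.0 - sum(p * p for p in freqs.values())


# Hypothetical counts for 100 individuals: 25 and 25 homozygotes, 50 heterozygotes.
p, q = allele_frequencies_from_genotypes(25, 50, 25)

for snp, freqs in ALLELE_FREQUENCIES.items():
    print(f"{snp}: expected heterozygosity = {expected_heterozygosity(freqs):.3f}")
```

Under this assumption, a biallelic SNP reaches its maximum expected heterozygosity of 0.5 when both alleles have a frequency of 0.5, which is why markers with balanced allele frequencies are the most informative.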
The data were then entered into PowerStats version 12 software, and the resulting indices are listed in Table 5. Based on the results summarized in Table 5, both the minimum MP and the maximum PD were obtained for Rs2107612, which indicates that this SNP has the greatest power of discrimination between 2 unrelated individuals in the studied population.
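For readers who want to see how indices of this kind are derived, the following Python sketch computes MP, PD, PE, and TPI from genotype counts using the formulas commonly associated with PowerStats: MP as the sum of squared genotype frequencies, PD = 1 - MP, PE = h²(1 - 2hH²), and TPI = 1/(2H), where h is the observed heterozygosity and H the observed homozygosity. The genotype counts and the function name are hypothetical, and the sketch is not the PowerStats software itself; the labels Ho (homozygosity) and He (heterozygosity) follow the usage in Table 5.

```python
def forensic_indices(genotype_counts):
    """Compute forensic indices for one locus.

    genotype_counts maps a genotype, given as a pair of alleles such as
    ("A", "G"), to the number of individuals carrying it.
    """
    n = sum(genotype_counts.values())
    genotype_freqs = [count / n for count in genotype_counts.values()]
    het = sum(count for (a, b), count in genotype_counts.items() if a != b) / n
    hom = 1.0 - het
    mp = sum(g * g for g in genotype_freqs)  # matching probability
    return {
        "MP": mp,
        "PD": 1.0 - mp,  # power of discrimination
        "PE": het ** 2 * (1.0 - 2.0 * het * hom ** 2),  # power of exclusion
        "TPI": 1.0 / (2.0 * hom) if hom > 0 else float("inf"),
        "Ho": hom,  # homozygosity, as labeled in Table 5
        "He": het,  # heterozygosity, as labeled in Table 5
    }


# Hypothetical genotype counts for one biallelic SNP typed in 100 individuals.
example_counts = {("A", "A"): 25, ("A", "G"): 50, ("G", "G"): 25}
indices = forensic_indices(example_counts)
for name, value in indices.items():
    print(f"{name}: {value:.4f}")
```

Because PD is defined as 1 - MP, the locus with the lowest MP necessarily has the highest PD, which is consistent with both extremes being reported for the same SNP.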

Discussion
Identification of autosomal and mtDNA SNPs in different ethnic groups in Iran can show the relationship between Iranian ethnicities and other ethnicities, and could be informative in determining the pattern of human migration.13,18 For the present research, 4 SNPs were selected from the SNPforID database. These SNPs have approximately 50:50 allelic frequencies in various populations, so a similar distribution could be expected in other populations. Allele frequencies of the SNPs were obtained by HRM analysis of 100 different individuals living in the northwest of Iran (East Azerbaijan, West Azerbaijan, Ardabil, and Zanjan provinces). According to our findings, only 2 polymorphisms (Rs2107612 and Rs1355366) had approximately balanced allelic frequencies, with heterozygosity greater than 50%. Accordingly, these polymorphisms are very informative and can be used for identification in the Iranian Azeri population.
Also, both the minimum MP (matching probability) and the maximum PD (power of discrimination) were obtained for Rs2107612, which indicates that this SNP has the greatest power of discrimination between 2 unrelated individuals in the studied population.
Furthermore, comparison of the allelic frequencies of all 4 SNPs in the studied population with those of other populations in the SNPforID database showed that they were very similar to those of the Persian-speaking populations in Iran (Figures 5-8). This finding is in line with previous studies showing that the genetic distances between Iranians and populations in Central Asia, East Asia, and Southeast Asia were greater than those between Iranians and populations in West Asia, such as Turkey and the Caucasus.13,18

Figure 3. Derivative Melt Curves Obtained by HRM Analysis. Each peak represents an SNP.

Figure 4. Aligned Melt Curves Obtained by HRM Analysis. Each peak represents an SNP.

Figure 2. Difference Plot Obtained by HRM Analysis. Each peak represents an SNP.
MP (matching probability): the probability that 2 unrelated individuals in the studied population have identical genotypes at this SNP. PD (power of discrimination): the power of this SNP to discriminate between 2 unrelated individuals in the studied population. PE (power of exclusion): the probability of excluding a man who is not the true father from paternity. TPI (typical paternity index): a value calculated for a single genetic marker or locus that reflects the statistical weight of that locus in favor of or against parentage. Ho (homozygosity): possessing 2 identical forms of a particular gene, one inherited from each parent. He (heterozygosity): possessing 2 different forms of a particular gene, one inherited from each parent. A1 Frq (allele 1 frequency): the relative frequency of allele 1 at a particular locus in the studied population. A2 Frq (allele 2 frequency): the relative frequency of allele 2 at a particular locus in the studied population.

Table 1. Primers Used in the Primary PCR and the Characteristics of Amplified Fragments Containing the SNP Marker

Table 2. Primers Used in HRM and the Characteristics of Amplified Fragments Containing the SNP Marker

Table 3. Primary PCR was performed using Taq DNA Polymerase 2x Master Mix (Ampliqon) with the following conditions

Table 4. HRM was performed using a StepOne Plus machine (ABI) and 5x HOT FIREPol EvaGreen HRM Mix (ROX) (Solis BioDyne) with the following conditions

Figure 1. Agarose Gel Electrophoresis of 7 Extracted DNA Samples With Negative and Positive Controls.

Table 5. Various Indices Based on PowerStats