A flaw in the most common method of population analysis raises questions about population information

Massive and highly complex datasets, the product of the so-called “big data revolution,” now drive genetic research and other population studies. To manage this data effectively, scientists rely on statistical methods that compress and simplify it while retaining most of the vital information. The most widely used of these methods in population genetics is principal component analysis (PCA), which serves as the primary tool for exploratory analysis and data description in the majority of population genetic studies and has a wide range of applications across genetics. A new study argues that PCA is deeply flawed, casting doubt on many genetic findings. Because PCA has been applied in thousands of research papers, the authors claim that the results of much of that research may be incorrect.

Assessing the accuracy of PCA clustering. Credit: Scientific Reports

PCA gap may affect genetic science

Principal component analysis is the most common method for managing genetic data. Many studies have used PCA to infer ethnic and genetic relationships, and the method is also applied in medical genetics and ancestry testing. The new study shows that this widely used method is flawed, which poses a potential problem for genetic science.

One of the uses of PCA is to examine the population structure of a person or group to determine their ancestry. It is also used to analyze demographic history, infer kinship, and identify ancestral origins in the data. These important elements of population genetics require a reliable method. Because PCA is so widely used, its results are generally assumed to be correct. However, the new study shows that the method is unreliable and does not support sound statistical conclusions about the data.

The researchers focused on the use of PCA in population genetics. They found that an unknown sample could be made to appear similar to almost any population simply by changing the number and types of reference samples. The method can therefore generate an endless number of historical scenarios; all of them are mathematically valid, but only one can be biologically correct. The study also examined the flexibility of PCA. That flexibility undermines confidence in the method, because any change in the reference or test samples leads to a different result. A large number of studies have analyzed populations using PCA, and the researchers argue that their results may not be entirely accurate. They propose alternative methods of analysis to make population genetics more reliable.
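The effect of the reference panel can be illustrated with a toy example. The sketch below is not the study's code or data. It assumes four hypothetical "populations" represented by pure colors in a three-dimensional space, a made-up test sample, and a helper named `nearest_population` introduced here purely for illustration. It runs PCA through NumPy's SVD, projects the test sample onto the first two components, and reports which reference population lies closest on the resulting plot.

```python
import numpy as np

# Hypothetical toy "populations": pure colors in RGB space. Each one stands
# in for a reference population with a single, noise-free profile.
POPULATIONS = {
    "Red":   np.array([1.0, 0.0, 0.0]),
    "Green": np.array([0.0, 1.0, 0.0]),
    "Blue":  np.array([0.0, 0.0, 1.0]),
    "Black": np.array([0.0, 0.0, 0.0]),
}


def nearest_population(test_sample, sample_counts, n_components=2):
    """Fit PCA on a reference panel, project the test sample, and return
    (label, distances) for the closest reference population on the PC plot."""
    labels = list(sample_counts)
    # Build the reference panel: each population repeated `count` times.
    reference = np.vstack(
        [np.tile(POPULATIONS[name], (sample_counts[name], 1)) for name in labels]
    )
    mean = reference.mean(axis=0)
    # PCA via SVD of the mean-centered panel; rows of vt are the PC axes.
    _, _, vt = np.linalg.svd(reference - mean, full_matrices=False)
    axes = vt[:n_components]

    projected_test = (test_sample - mean) @ axes.T
    distances = {}
    for name in labels:
        projected_pop = (POPULATIONS[name] - mean) @ axes.T
        distances[name] = float(np.linalg.norm(projected_pop - projected_test))
    return min(distances, key=distances.get), distances


# A washed-out red: in the original 3-D space it is closest to Red.
test_sample = np.array([0.75, 0.30, 0.30])

balanced = {"Red": 5, "Green": 5, "Blue": 5, "Black": 5}
no_blue = {"Red": 5, "Green": 5, "Black": 5}

label_balanced, _ = nearest_population(test_sample, balanced)
label_no_blue, _ = nearest_population(test_sample, no_blue)
print(label_balanced, label_no_blue)  # prints: Black Red
```

Under these assumptions, the same test sample sits closest to Black when all four populations are in the reference panel and closest to Red once Blue is removed, even though the sample itself never changes. The example only varies which populations are included; it does not reproduce the study's experiments on sample sizes.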

Clinical significance

Understanding ancestry and evolutionary trends is essential in population genetics. For some medical conditions, investigators may need to trace the ancestral line in order to develop a more effective treatment. Now that the study has exposed the limitations of PCA, scientists can work on new and better methods of analyzing demographic data to address population questions for clinical purposes.

Conclusion

People are curious about their ancestry and lineage. As science seeks solutions to such questions, it is essential that it takes the right approach to gathering information. The results of this study have made the limitations of PCA apparent, and scientists can now aim to develop new and better methods for genetic science. With more reliable methods, ancestry and other population information could be traced without fear of false results.

References

The results of principal component analyses (PCA) in population genetic studies are highly biased and need to be re-evaluated