Skip to content

jk2236/RM_WGS

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

12 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Record-matching of STR profiles with fragmentary genomic SNP data

Description

This repository contains tools for performing record matching of STR profiles with fragmentary genomic SNP profiles, as described in Kim and Rosenberg. The record matching pipeline starts with a set of reference files containing SNP-STR genotypes on reference individuals, a set of files containing STR profiles only on test individuals, a set of files containing SNP profiles only on test individuals, and a set of genetic map files. It outputs a match-score matrix for all STR-SNP pairs, with rows indicating STR profiles and columns indicating SNP profiles. It enables hypothesis tests for a variety of relatedness hypotheses based on SNP and STR profiles. For a demonstration of the record matching pipeline, please see Example. The repository also includes scripts to generate simulated pedigrees and fragmentary genomic SNP data.

Dependencies

Dataset

  • A phased reference SNP-STR haplotype panel of Saini et al. from the 1000 Genomes Project phase 3 containing 2,504 individuals can be downloaded from here. Processed data containing 1-Mb SNP windows extending 500 kb in each direction from each CODIS locus midpoint can be found under data/1KGP.
  • HGDP SNP-STR data containing 872 individuals can be downloaded from here.
  • Human genetic maps. HapMap GrCh36 and GrCh37 genetic maps in PLINK format. Can be downloaded from BEAGLE page.
  • All genotypes in the reference panel must be non-missing and phased for BEAGLE imputation.

Reference

Kim J, Rosenberg NA (2022). Record-matching of STR profiles with fragmentary genomic SNP data. bioRxiv, 2022.09.01.505545. 10.1101/2022.09.01.505545.

Kim J, Edge MD, Algee-Hewitt BFB, Li JZ, Rosenberg NA (2018). Statistical detection of relatives typed with disjoint forensic and biomedical loci. Cell, 175(3):848-858.e6. 10.1016/j.cell.2018.09.008.

Edge MD, Algee-Hewitt BFB, Pemberton TJ, Li JA, Rosenberg NA (2017). Linkage disequilibrium matches forensic genetic records to disjoint genomic marker sets. PNAS, 114(22):5671-5676. 10.1073/pnas.1619944114.

About

No description, website, or topics provided.

Resources

License

Unknown, MIT licenses found

Licenses found

Unknown
LICENSE
MIT
LICENSE.md

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages