TE Insertion Merging and SVLEN

Hi,

Thank you for providing this useful tool. I have used xTea to identify TE insertions in a large cohort and have a few questions regarding quality control and population-level merging.

I am using your `x_vcf_merger.py` script to combine nearby insertions at the population level because, without merging, I end up with a very large number of rare insertions. I only merge TEs from the same family (e.g., Alus with Alus, not Alus with LINEs).
When I looked at what was merged, I noticed that the script combines TEs regardless of their reported SVLEN, which can vary greatly. For example, LINE1s should be around 6 kb, but the LINEs merged into a single insertion can have SVLENs ranging from ~100 bp to ~6 kb.

I have also noticed that the `x_vcf_merger.py` script reports the most common SVLEN (correct me if I'm wrong), which is often shorter than the expected SVLEN for the TE family. 
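
To illustrate the concern, here is a minimal sketch of how the two summaries can disagree within one merged cluster. The SVLEN values are hypothetical, and `summarize_svlen` is my own helper, not part of xTea; I am only assuming the merged value is the most common one.

```python
from collections import Counter

def summarize_svlen(svlens):
    """Return (most_common, maximum) SVLEN for one merged cluster."""
    counts = Counter(svlens)
    # most_common(1) returns a list holding the single (value, count) pair
    most_common = counts.most_common(1)[0][0]
    return most_common, max(svlens)

# Hypothetical SVLENs of LINE1 calls that were merged into one site
cluster = [120, 120, 120, 850, 6010]
mode_len, max_len = summarize_svlen(cluster)
print(mode_len, max_len)  # 120 6010
```

In this made-up cluster the most common SVLEN is 120 bp, while the maximum is close to the expected ~6 kb.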

I had a few questions regarding this:

1. Would you recommend applying additional filtering based on SVLEN before running the merge script?
2. Should filters be applied based on subclass ("two_sides_tprt_both", "one_half_side", etc.) before merging? If so, where can I find the definition of each subclass? From what I understand, "two_sides_tprt_both" is the most reliable, but it isn't always the most common subclass.
3. When I report SVLEN in my downstream analysis, should I use the SVLEN chosen by the `x_vcf_merger.py` script, even though it can be very small at times, even zero? Or should I use the maximum SVLEN among the merged insertions, which seems to be closer to the expected length for the TE family?
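
For question 1, the kind of pre-merge filter I have in mind could look roughly like the sketch below. It assumes SVLEN is stored in the INFO column as `SVLEN=<int>`; the function names, the example records, and the 100 bp threshold are hypothetical and not taken from xTea.

```python
import re

# Matches SVLEN=<int> as a whole INFO key (start of field or after ';')
SVLEN_RE = re.compile(r"(?:^|;)SVLEN=(-?\d+)")

def svlen_of(vcf_line):
    """Return |SVLEN| from the INFO column of a VCF data line, or None."""
    fields = vcf_line.rstrip("\n").split("\t")
    if len(fields) < 8:
        return None
    match = SVLEN_RE.search(fields[7])
    return abs(int(match.group(1))) if match else None

def filter_by_svlen(lines, min_len):
    """Yield header lines unchanged and data lines with |SVLEN| >= min_len."""
    for line in lines:
        if line.startswith("#"):
            yield line
            continue
        svlen = svlen_of(line)
        if svlen is not None and svlen >= min_len:
            yield line

# Hypothetical records for illustration only
records = [
    "##fileformat=VCFv4.2",
    "chr1\t1000\t.\tN\t<INS>\t.\tPASS\tSVTYPE=INS;SVLEN=6010",
    "chr1\t1020\t.\tN\t<INS>\t.\tPASS\tSVTYPE=INS;SVLEN=0",
]
kept = list(filter_by_svlen(records, min_len=100))
```

Here `kept` would hold the header line and the 6010 bp record, so the zero-length call never reaches the merge step.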

Thank you for your help and for developing xTea.
