Extract amino acid position from Excel file downloaded from gnomAD export

We can simply use the following formulae to extract the number from string in Excel file

gnomAD-extract-pos

=SUMPRODUCT(MID(0&E4, LARGE(INDEX(ISNUMBER(--MID(E4, ROW(INDIRECT("1:"&LEN(E4))), 1)) * ROW(INDIRECT("1:"&LEN(E4))), 0), ROW(INDIRECT("1:"&LEN(E4))))+1, 1) * 10^ROW(INDIRECT("1:"&LEN(E4)))/10)

Note: Tested in Excel 2010

How to liftOver VCF file by batch?

We can use the tool picard to liftOver VCF file by batch

URL:
http://broadinstitute.github.io/picard/command-line-overview.html#LiftoverVcf

picard-liftover

create dictionary (optional)
if you do not have the dict of reference fasta, you need to create first

java -jar /software/picard-tools-2.8.1/picard.jar CreateSequenceDictionary REFERENCE=/software/GATK_files/hg38/hg38.fa OUTPUT=/software/GATK_files/hg38/hg38.dict

download the chain file
Go the http://hgdownload.cse.ucsc.edu/downloads.html#human and download from liftOver hyperlink

liftOver
java -jar /software/picard-tools-2.8.1/picard.jar LiftoverVcf I=input.vcf O=lifted_over.vcf CHAIN=/software/liftOver/hg19ToHg38.over.chain.gz REJECT=rejected_variants.vcf R=/software/GATK_files/hg38/hg38.fa

Reference:
Chain format
URL: https://genome.ucsc.edu/goldenpath/help/chain.html

How to liftOver VCF file by web tool?

We can visit the web tool at https://genome.ucsc.edu/cgi-bin/hgLiftOver

leftOver

Select the Original Genome, Original Assembly, New Genome and New Assembly, say
Original Genome: Human
Original Assembly: Dec. 2013 (GRCh38/hg38)
New Genome: Human
New Assembly: Feb. 2009 (GRCh37/hg19)

Paste the data and click “Submit” button as the RHS of the text area

The result file (i.e. hglft_genome_260a_37bd80.bed) will be ready to download after some time
leftOver-2