We are uploading the mapped reads (*.bam) and genotype files (*.vcf) used in the publication "Leveraging mutational burden for complex trait prediction in sorghum" (Valluru et al. 2019). We sequenced 239 biomass sorghum lines on the Illumina HiSeq 4000 (2x150 bp) and called variants with the Sentieon tools. Briefly, fastq files were aligned to the Sorghum bicolor reference genome version 3.1 (https://phytozome.jgi.doe.gov). PCR duplicates were removed, base quality scores were recalibrated against a set of ‘high confidence’ SNPs, and the recalibrated files were processed through the Haplotype Caller (HC). The dataset contains 239 samples, corresponding to 229 unique accessions.
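As a quick sanity check after download, the sample count stated above can be compared with the sample columns in the VCF header. The following is a minimal sketch using only the Python standard library; the helper names (`vcf_sample_names`, `open_vcf`) and the demo sample names are illustrative assumptions, not part of the dataset or of the Sentieon tools.

```python
import gzip
from typing import Iterable, List

# A VCF header line starts with nine fixed columns (CHROM ... FORMAT);
# every column after those is a sample name.
FIXED_COLUMNS = 9


def vcf_sample_names(lines: Iterable[str]) -> List[str]:
    """Return the sample names listed on the #CHROM header line."""
    for line in lines:
        if line.startswith("#CHROM"):
            fields = line.rstrip("\n").split("\t")
            return fields[FIXED_COLUMNS:]
        if not line.startswith("#"):
            # Reached the data records without seeing a header line.
            break
    raise ValueError("no #CHROM header line found")


def open_vcf(path: str):
    """Open a plain or gzip/bgzip-compressed VCF in text mode (hypothetical helper)."""
    opener = gzip.open if path.endswith(".gz") else open
    return opener(path, "rt")


# Tiny in-memory header with made-up sample names, for demonstration only.
demo_header = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tsampleA\tsampleB\n",
]
demo_samples = vcf_sample_names(demo_header)
print(len(demo_samples), demo_samples)
```

On a downloaded file, passing the handle returned by `open_vcf` to `vcf_sample_names` should yield 239 names if the file matches the description above. How sample names map to the 229 unique accessions is not specified here, so that mapping is left out of the sketch.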