Identifying the genes that enhancers interact with is critical for understanding disease-related misregulation of the genome. We are creating a benchmark dataset to aid in comparing methods that predict enhancer-gene interactions. The benchmark incorporates RAD21 ChIA-PET data in GM12878 from the Snyder lab; eQTLs in lymphoblastoid cells curated by the Kellis lab in HaploReg (including SNPs in LD at r² > 0.8); and high-resolution Hi-C loops in GM12878 from the Aiden lab (Rao, …, Aiden, 2014, Cell). To define a negative set for each enhancer-like region with at least one positive link, we select all genes that 1) are within 500 kb of the region and 2) are not linked to it in any individual dataset (i.e., pairs with evidence from even a single data type are excluded from the negative set).
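The negative-set rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the benchmark's actual pipeline: the record types, the function names, and the dataset keys are hypothetical; distance is assumed to be measured from a gene's TSS to the nearest edge of the region; and a link in any dataset is assumed to make a region eligible, since the text does not spell out either detail.

```python
from collections import namedtuple

# Hypothetical record types, introduced only for this illustration.
Gene = namedtuple("Gene", ["name", "chrom", "tss"])
Region = namedtuple("Region", ["name", "chrom", "start", "end"])

WINDOW = 500_000  # 500 kb, as stated in the text


def distance(region, gene):
    """Distance in bp from the gene TSS to the nearest region edge (0 if inside).

    Assumption: the text does not say how distance is measured.
    """
    if gene.tss < region.start:
        return region.start - gene.tss
    if gene.tss > region.end:
        return gene.tss - region.end
    return 0


def negative_genes(region, genes, links_by_dataset, window=WINDOW):
    """Return names of genes near `region` that no dataset links to it.

    links_by_dataset maps a dataset name to a set of
    (region_name, gene_name) pairs.
    """
    # Union of genes linked to this region in ANY individual dataset.
    linked = set()
    for pairs in links_by_dataset.values():
        linked |= {g for (r, g) in pairs if r == region.name}
    if not linked:
        # Assumption: regions without any link get no negative set.
        return []
    return [
        g.name
        for g in genes
        if g.chrom == region.chrom
        and distance(region, g) <= window
        and g.name not in linked
    ]


# Toy data, entirely made up for the example.
region = Region("R1", "chr1", 1_000_000, 1_000_500)
genes = [
    Gene("A", "chr1", 1_200_000),  # linked, so never a negative
    Gene("B", "chr1", 1_400_000),  # within 500 kb, unlinked
    Gene("C", "chr1", 1_600_000),  # farther than 500 kb
    Gene("D", "chr2", 1_100_000),  # different chromosome
    Gene("E", "chr1", 600_000),    # within 500 kb upstream, unlinked
]
links = {"chia_pet": {("R1", "A")}, "eqtl": set(), "hic": {("R1", "A")}}
negatives = negative_genes(region, genes, links)
```

With this toy input, genes B and E form the negative set for R1. Taking the union of links across datasets before filtering is what keeps any pair with even single-dataset evidence out of the negatives.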
Tissue of origin    Cell type                 Biosample
blood               immortalized cell line    GM12878