I'm trying to build a computational model of CpG site methylation and its downstream effect on the expression of a gene of interest. However, a chromosome can contain thousands of methylated sites, so it is practically infeasible to examine every site individually to determine which ones affect the gene's expression. What would be the best way to approach this problem? Would it be better to restrict the analysis to regions already known to affect the gene of interest, such as its enhancers? That seems to defeat the purpose of the model, which is ideally to find new CpG sites that affect the gene's expression profile.
Any opinions on this topic are greatly appreciated.