What's the difference between count and percentile rank in Affymetrix datasets - (Nov/09/2012)
I'm going through the GEO profiles for my gene of interest. I noticed that for each data set, there is a value indicated by a red bar, and a rank indicated by a blue dot. When making a comparison, say between control and drug-treated samples, which should I look at? I quite often see a drastic change in the value (the red bar) but not much change in the rank (blue dot). Does that mean there is no alteration of this gene?
Red column: Each column represents the expression measurement extracted from the VALUE column of one original submitter-supplied Sample record. The original Sample accessions (GSMxxx) are listed in the gray boxes along the bottom of the chart. Sample records are submitted by the scientific community and reflect a wide variety of data types that are processed and normalized using a wide variety of methods. There is no standard unit for gene expression and so expression values should be considered arbitrary units. It can be assumed that the value measurements within a GEO DataSet have been calculated in an equivalent manner, but it is not usually appropriate to make direct comparisons of values between different DataSets. Single channel samples are normalized signal count values, whereas dual channel samples are typically test/reference log ratios. Check the 'Data processing' field or VALUE description in original Sample records for information on how the VALUEs were calculated.
Blue square: Represents rank order of expression measurements. All VALUEs within a Sample are rank ordered, and then placed into percentile 'bins'. In other words, all the values of one hybridization are sorted, then split into 100 groups. Thus, the blue rank squares on charts give an indication of where the expression of that gene falls with respect to all other genes on that array.
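To make the binning described above concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the function name `percentile_ranks`, the tie handling, and the made-up expression values are all hypothetical, and GEO's own binning code may differ in detail. It sorts all values of one sample, splits them into 100 bins, and shows that a gene's value can double while its rank stays the same, because rank only moves when the gene passes other genes on the same array.

```python
def percentile_ranks(values):
    """Assign each value in ONE sample a percentile bin from 1 to 100.

    Assumption: bins are formed by sorting all values and cutting the
    sorted list into 100 roughly equal groups. Ties are broken by
    original order (Python's sort is stable).
    """
    n = len(values)
    # Indices of the values, ordered from lowest to highest expression.
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0] * n
    for position, idx in enumerate(order):
        ranks[idx] = position * 100 // n + 1
    return ranks


# Hypothetical arrays: 999 background genes plus the gene of interest (last).
background = [10 * i for i in range(1, 1000)]   # values 10 .. 9990
control = background + [20000]                  # gene of interest in control
treated = background + [40000]                  # value doubles after treatment

control_rank = percentile_ranks(control)[-1]
treated_rank = percentile_ranks(treated)[-1]
print(control_rank, treated_rank)
```

In this made-up case the gene is already the most highly expressed one on the array, so doubling its value cannot raise its rank. An unchanged rank therefore does not by itself mean the gene is unaltered. For a control versus treated comparison within one DataSet, the values (red bars) carry the size of the change, while the rank shows where the gene sits relative to all other genes on each array.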
Thanks for this clear explanation!