I don't know how to do it in Excel -- at least not quickly -- but if you save both lists as text files containing one gene name per line, you can compare the two lists using Perl:
CODE
#!/usr/bin/perl -w
use strict;
# Union: all elements in both lists excluding duplicates
# Intersection: items that are in both lists
# Difference (symmetric): members of FIRST or SECOND but not of FIRST and SECOND
# Difference (simple): members of FIRST but not SECOND
open (FIRST, "gene_list_100.txt") or die;
open (SECOND, "gene_list_2000.txt") or die;
my @first = (<FIRST>);
chomp (@first);
my @second = (<SECOND>);
chomp (@second);
my @union = my @isect = my @sym_diff = ();
my %union = my %isect = my %count = ();
foreach my $e (@first, @second) {
$count{$e}++;
}
foreach my $e (keys %count) {
push(@union, $e);
if ($count{$e} == 2) {
push @isect, $e;
} else {
push @sym_diff, $e;
}
}
my %seen;
my @first_only;
@seen{@second} = ();
foreach my $item (@first) {
push (@first_only, $item) unless exists $seen{$item};
}
@union = sort (@union);
@isect = sort (@isect);
@sym_diff = sort (@sym_diff);
@first_only = sort (@first_only);
print "Union: " . scalar(@union) . " " . join (" ", @union) . "\n";
print "Intersection: " . scalar(@isect) . " " . join (" ", @isect) . "\n";
print "Symmetrical Diff: " . scalar(@sym_diff) . " " . join (" ", @sym_diff) . "\n";
print "Simple Diff: " . scalar(@first_only) . " " . join (" ", @first_only) . "\n";
I guess in Excel you could have both gene lists in a single column, and then sort the column by name. Next to the column, you could enter a formula like:
CODE
=IF(A1=A2,"present"," ")
Drag the formula so that it covers the whole column, and count the number of times "present" appears. You could keep track of the two gene lists by changing the font color of the shorter one before sorting them on name.
Perl seems a better solution, though, unless someone better at Excel than me (read: almost anyone) has a better way...