Data from ~15k pages from dmoz.org (therefore mostly HTML):

Most Common Headers

content-type  15348
date  15315
server  14757
content-length  10243
accept-ranges  8873
last-modified  8769
etag  8012
x-powered-by  5491
set-cookie  5105
transfer-encoding  4614
cache-control  3900
connection  3900
expires  2264
content-location  1547
vary  1406
pragma  1345
p3p  1000
microsoftofficewebserver  739
x-host  420
x-inkt-uri  420
x-inkt-site  420
x-cache  412
x-aspnet-version  387
x-pad  358
content-language  274
x-cache-lookup  237
x-server-ip  206
via  192
age  111
page-completion-status  68
x-pingback  64
filter-revision  63
iisexport  55
keep-alive  49
tcn  36
nncoection  33
mime-version  30
status  19
pics-label  16
x-creationtime  16
composed-by  15
proxy-connection  14
www-authenticate  14
window-target  11
microsoftsharepointteamservices  11
limerick  10
cneonction  10
x-server  9
srv  9
ms-author-via  8
x-spip-cache  8
x-back  7
x-hostname  7
x-servername  7
content  6
x-vortech-php  6
x-server-name  6
cluster-server  5
x-dip  5
refresh  5
x-gu-httpd  5
x-accelerated-by  5
commerce-server-software  5
x-cocoon-version  5
public-extension  4
x-perlbal  4
x-mod-choke  4
x-dmuser  4
resourcetag  4
x-track  4
application-name  4
x-varnish  4
x-tpsrv  4
x-responding-server  4
content-transfer-encoding  3
x-webserver  3
set-cookie2  3
x-ud-loopcounter  3
x-ud-target  3
x-ud-host  3
x-ud-method  3
from  3
x-ud-remote_addr  3
servername  3
cache-expires  3
x-handling  3
test-test  3
x-hacker  3
allow  3
content_length  2
x-highwire-sessionid  2
x-ips-cache  2
x-guru  2
cookieheader  2
x-cached-time  2
dcgi-server  2
x-via  2
http_host_was  2
host  2
cache-last-checked  2
proxy-agent  2
remote_addr_was  2
content-script-type  2
server_addr_was  2
x-n  2
x-content-parsed-by  2
warning  2
realserver  2
x-kiosk_expires  2
servlet-engine  2
server_name_was  2
x-cache-ttl  2
x-exectime  2
x-kexpinterval  2
uri  2
ppserver  1
(empty)  1
x-gahelper  1
srvname  1
copyright  1
cashe-control  1
portal-engine  1
0  1
wn_vars  1
x-bender  1
x-cache-hit  1
cachecontrol  1
x-atg-version  1
x-ibs-ccds-origin  1
req-id  1
gamesystem.com  1
served-by  1
x-caching-rule-id  1
message-id  1
charset  1
x-rehbein  1
x-lrecl  1
mirror  1
e-tag  1
version  1
matrix-server  1
slash_log_data  1
created-by  1
x-accelerator-vary  1
bim server  1
x-sfx-revision  1
x-instance-name  1
wn  1
content-md5  1
web  1
x-dave  1
emap-type  1
content-encoding  1
machinename  1
isp  1
x-machine-id  1
x-weblogic-cluster-list  1
x-coremedia-cacheable  1
ihs  1
x-site  1
x-zend-winenabler  1
delegate-ver  1
policyref  1
x-probase-spider-mirrorof  1
compression-control  1
character set '#33' is not a compiled character set and is not specified in the 'c  1
slb_server  1
x-cms-powered-by  1
x-content-type-warning  1
phone  1
x-cachettl  1
if-modified-since  1
hosted-with  1
x-recfm  1
x-storesense  1
x-ibs-ccds-version  1
imagetoolbar  1
x-ga-server-instance  1
communityserver  1
content location  1
x-vary  1
file 'c  1
x-fb-host  1
x-ruby-cluster-id  1
x-sfx-webhead  1
x-aipby-by  1
x-mrhost  1
author  1
hostname  1
x-cache-rules-applied  1
pagebuildtime  1
x-xrds-location  1
cbsid  1
x-glyphgate  1
x-r4l-vhost  1
x-id  1
south australian government  1
x-header-set-id  1
source  1
ignore__length  1
if-none-match  1
sn  1
modlayout  1
content-style-type  1

Most Common Content-Type Values

text/html  12419
text/html; charset=ISO-8859-1  671
text/html; charset=UTF-8  629
text/html; charset=utf-8  628
text/html; charset=iso-8859-1  373
text/html;charset=ISO-8859-1  104
text/html; charset=windows-1251  86
text/html;charset=iso-8859-1  62
text/html;charset=utf-8  42
text/html;charset=UTF-8  39
text/html;  26
application/pdf  20
text/html; charset=US-ASCII  19
text/html; charset=iso-8859-2  17
text/html; charset=ISO-8859-2  14
text/html; charset=WINDOWS-1251  11
text/html; charset=koi8-r  9
text/html; charset=ISO-8859-15  8
image/jpeg  7
text/html; Charset=UTF-8  7
text/html; charset=windows-1252;  6
text/html; charset=  6
text/html; charset=iso-8859-15  5
text/html; charset=EUC-JP  5
text/html; charset=iso-8859-9  4
text/xml  4
application/xml  4
text/html; charset=shift_jis  4
text/plain  4
text/html; Charset=utf-8  4
text/html;ISO-8859-1; charset=UTF-8  4
text/html; charset=utf-8;  4
text/html; charset=Big5  4
text/html; charset=latin1  3
text/html; charset=none  3
text/html; Charset=ISO-8859-1  3
text/html; Charset=windows-1255  3
text/html; charset=gb2312  3
text/html;charset=windows-1251  3
text/html; Charset=iso-8859-1  3
text/html; charset=windows-1252  3
text/html; charset=windows-1250  3
text/html; Charset=windows-1252  2
text/html; charset=WINDOWS-1257  2
text/html; charset=Shift_JIS  2
text/html; charset=ISO-8859-9  2
text/html; charset=cp1251  2
text/html; charset=Windows-1252  2
text/html;charset=iso-8859-15  2
text/html; charset="UTF-8"  2
text/html; charset=ISO-8859-1;  2
text/html; charset=windows-1256  2
text/html; charset=us-ascii  2
text/html; charset=GB2312  1
text/html; charset=latin-1  1
text/html;charset=euc-kr  1
text/html; charset=utf8  1
text/html; charset=big5  1
text/html; charset=shift_js  1
text/html; charset=zh-tw  1
text/html; charset=iso-8859-1;  1
text/html; charset=MS949  1
text/html;charset=WINDOWS-1252  1
text/html;charset = utf-8  1
text/html; charset=BS_4730  1
text/html; charset= windows-1252  1
application/atom+xml; charset=UTF-8  1
text/html; qs=.9  1
application/x-shockwave-flash  1
text/html; charset=ISO-8859-7  1
text/html; charset=ISO-8859-4  1
text/html; charset="ISO-8859-1"  1
text/html; charset=iso8859-1  1
text/html; charset=gbk  1
text/html; Charset=ISO-8859-7  1
text/html; charset=NONE  1
text/html; charset=TIS-620  1
text/html;charset=euc-jp  1
text/html;charset=iso-8859-2  1
text/html;Charset=ISO-8859-1  1
text/html; charset=es_ES.ISO-8859-1  1
text/html; charset=ISO  1
text/html; charset=Windows-1251  1
text/html; charset=EUC-KR  1
text/html;charset=EUC-KR  1
text/html; charset=win-1251  1
text/html; charset=en_US  1
text/html; charset=no  1
text/html; charset=None  1
text/html; charset=CP1250  1
text/HTML; Charset=Windows-1251  1
text/html; charset=euc-jp  1
text/html; charset=iso.8859-1  1
text/html;charset=ISO-8859-15  1
text/xml;charset=utf-8  1
text/html; charset=big5,euc-jp  1
text/html;charset=Cp1252  1
text/html; charset=WINDOWNS-1252  1
text/html; charset=windows-1257  1
text/html; charset=windows-1254  1
text/html; charset=windows-1255  1
application/rss+xml  1
text/xml; charset=ISO-8859-1  1
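
Many rows in this table differ only in letter case, spacing, or quoting of the charset parameter. The following is a minimal sketch of how such variants could be folded together. The function name parse_content_type and the sample list are illustrative assumptions, not part of the original survey, and the sketch does not try to repair misspelled charset names such as "WINDOWNS-1252".

```python
from collections import Counter


def parse_content_type(value):
    """Return (media_type, charset) from a Content-Type value.

    Both parts are lower-cased; charset is None when absent or empty.
    """
    parts = [part.strip() for part in value.split(";")]
    media_type = parts[0].lower()
    charset = None
    for param in parts[1:]:
        name, sep, raw = param.partition("=")
        if sep and name.strip().lower() == "charset":
            # Tolerate stray spaces and optional double quotes around the value.
            charset = raw.strip().strip('"').lower() or None
            break
    return media_type, charset


# A few spellings taken from the table above.
samples = [
    "text/html; charset=UTF-8",
    "text/html;charset=utf-8",
    'text/html; charset="UTF-8"',
    "text/html;charset = utf-8",
    "text/HTML; Charset=Windows-1251",
    "text/html;",
]
normalized = Counter(parse_content_type(s) for s in samples)
```

With these samples, the first four spellings collapse into the single key ("text/html", "utf-8"), while "text/html;" yields a charset of None.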

Most Common Date Formats

These values come from every response header that is defined to carry an HTTP-date value, not just the "Date" header.

Valid (RFC 822)  26154
-1  67
0  40
Fri, Jan 1 1999 00:00:00 GMT  4
Wed Dec 12 00:22:55 2007 GMT  3
Wed 12 Dec 2007 00:16:47 GMT  2
Valid (asctime)  2
(empty)  1
Fri, Apr 01 1974 00:00:00 GMT  1
12,12 12 07 12:24:10 GMT  1
Thu May 11 20:47:17 2000 GMT  1
Fri Nov 16 03:58:06 2001 GMT  1
Sat Feb 8 12:37:23 2003 GMT  1
Thu Feb 22 19:38:39 2001 GMT  1
{ts '2007-12-11 19:26:11'}  1
Thu Feb 19 00:59:40 2004 GMT  1
Tue, 11 DEC 2007 23:21:58 GMT  1
Wed, 12 Dec 2007 00:25:23GMT  1
Mon May 13 10:02:36 2002 GMT  1
Thu Feb 24 01:16:44 2000 GMT  1
12.12.2007  1
Mon May 14 04:32:21 2001 GMT  1
Sat May 12 23:42:27 2001 GMT  1
1.28419E+17  1
Wed May 3 22:28:20 2000 GMT  1
Thu Sep 11 10:01:14 2003 GMT  1
Tue Jul 30 18:40:30 2002 GMT  1
Thu Jun 15 12:34:19 2000 GMT  1
Wed Jun 27 12:54:33 2001 GMT  1
Sat Nov 9 05:19:16 2002 GMT  1
Sat Nov 25 05:02:40 2000 GMT  1
Sun Jun 18 19:10:46 2000 GMT  1
Wed, 12 Dec 2007 00:25:41GMT  1
Wed, 12-Dec-2007 24:25:18 GMT  1
Sun Feb 26 22:29:31 2006 GMT  1
Tue Aug 2 16:32:29 2005 GMT  1
Tue Jun 3 06:04:04 2003 GMT  1
Sat Apr 24 21:07:50 2004 GMT  1
Sun Jul 13 11:48:28 2003 GMT  1
Tue May 11 14:30:24 2004 GMT  1
Thu Dec 14 06:49:36 2000 GMT  1
Tue, 11 Dec 2007 17:29:58 +0300 GMT  1
Mon May 29 17:26:47 2000 GMT  1
Sunday 15-May-1994 12:00:00 GMT  1
Fri Oct 19 15:51:59 2007 GMT  1
Wed, 12-Dec-2007 24:24:43 GMT  1
Wed, 12 Dec 2007 00:25:07GMT  1
Fri Feb 22 22:42:43 2002 GMT  1
Thu Jun 7 03:08:51 2007 GMT  1
Mon Sep 10 12:29:30 2007 GMT  1
Wed Apr 7 23:36:25 2004 GMT  1
Wed Nov 7 22:40:16 2007 GMT  1
Thu Jun 22 02:22:23 2000 GMT  1
Sun Jul 24 17:55:17 2005 GMT  1
Tue Oct 16 19:42:47 2007 GMT  1
{ts '2007-12-11 18:25:47'}  1
Wed, 12-Dec-2007 24:22:48 GMT  1
Sat Sep 1 22:42:58 2007 GMT  1
Wed Jul 18 09:45:52 2007 GMT  1
Sun Mar 3 23:52:59 2002 GMT  1
Fri Nov 30 13:35:39 2007 GMT  1
Tue Dec 11 16:37:50 2007 GMT  1
-1;  1
Tue Dec 11 01:45:16 2007 GMT  1
Wed Aug 30 22:58:09 2006 GMT  1
Mon Jun 25 18:41:19 2007 GMT  1
Sun Dec 22 12:40:14 2002 GMT  1
Fri Oct 12 23:37:18 2001 GMT  1
{ts '2007-12-11 19:26:30'}  1
Wed Feb 20 18:26:49 2002 GMT  1
Sun Apr 28 05:26:09 2002 GMT  1
12.12.2007 2:25:36  1
Fri Nov 16 01:11:08 2007 GMT  1
Fri, Jun 12 1981 08:20:00 GMT  1
Sat Nov 4 09:50:05 2000 GMT  1
Wed, 12-Dec-2007 00:30:14 GMT  1
Mi, 12 Dez 2007 00:25:34 GMT  1
Wed, 12-Dec-2007 00:26:09 GMT  1
Wed, 12-Dec-2007 24:25:03 GMT  1
Wed Sep 5 14:09:38 2001 GMT  1
Thu Jul 4 09:12:19 2002 GMT  1
Tue Dec 23 13:03:16 2003 GMT  1
{ts '2007-12-12 01:24:44'}  1
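
A classification like the one in this table could be approximated with a few regular expressions. The sketch below is an assumption about how such a check might look, not the validator actually used for the survey. It follows the three HTTP-date layouts allowed by RFC 2616 (the fixed-length RFC 822/1123 form, the obsolete RFC 850 form, and asctime), matches case-sensitively, and checks syntax only, so an impossible calendar date would still pass. The names classify_http_date and PATTERNS are hypothetical.

```python
import re

DAY = r"(?:Mon|Tue|Wed|Thu|Fri|Sat|Sun)"
WEEKDAY = r"(?:Monday|Tuesday|Wednesday|Thursday|Friday|Saturday|Sunday)"
MONTH = r"(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)"
TIME = r"(?:[01]\d|2[0-3]):[0-5]\d:[0-5]\d"

# Label plus compiled pattern for each layout, tried in order.
PATTERNS = [
    # Sun, 06 Nov 1994 08:49:37 GMT
    ("rfc822", re.compile(DAY + r", \d{2} " + MONTH + r" \d{4} " + TIME + " GMT")),
    # Sunday, 06-Nov-94 08:49:37 GMT
    ("rfc850", re.compile(WEEKDAY + r", \d{2}-" + MONTH + r"-\d{2} " + TIME + " GMT")),
    # Sun Nov  6 08:49:37 1994 (day of month is space-padded)
    ("asctime", re.compile(DAY + " " + MONTH + r" (?:\d{2}| \d) " + TIME + r" \d{4}")),
]


def classify_http_date(value):
    """Return 'rfc822', 'rfc850', 'asctime', or 'invalid' for a header value."""
    for label, pattern in PATTERNS:
        if pattern.fullmatch(value):
            return label
    return "invalid"
```

Under these patterns, values from the table such as "Wed Dec 12 00:22:55 2007 GMT" (asctime with a trailing zone), "Wed, 12 Dec 2007 00:25:23GMT" (missing space), and "Wed, 12-Dec-2007 24:25:18 GMT" (mixed layout, hour 24) all come out as invalid.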

Most Common First Server Product Token

Apache  3115
Microsoft-IIS/6.0  2563
Apache/1.3.37  1075
Apache/1.3.33  1059
Microsoft-IIS/5.0  866
Apache/2.0.52  508
Apache/1.3.39  465
Apache/2.0.54  407
Apache/2.2.3  371
Apache/1.3.34  325
Apache/1.3.27  245
Apache/1.3.26  242
Apache/2.2.4  223
Squeegit/1.2.5  206
Apache/2.0.46  189
Apache/2.0.59  176
Apache/2.2.6  156
Apache/1.3.29  126
Apache/2.0.55  124
Apache/2.0.53  100
Apache/1.3.31  91
Apache/1.3.36  79
Apache/2.0.51  69
Apache/ProXad  67
Rapidsite/Apa/1.3.33  57
Apache/2.0.61  55
Apache/2  55
Zeus/4.3  54
Apache/1.3.28  53
Apache/2.0.50  50
Apache/2.0  49
Sun-ONE-Web-Server/6.1  48
Apache/2.2.2  47
Apache/2.0.40  46
Apache/1.3  46
Apache/1.3.35  44
Microsoft-IIS/4.0  43
Netscape-Enterprise/4.1  43
Apache/2.0.49  42
Apache/1.3.20  41
Apache/2.2.0  40
NOYB  40
Apache/1.3.19  40
Server  38
Apache-Coyote/1.1  34
GFE/1.3  32
Apache/2.0.48  30
(empty)  28
Lotus-Domino  23
Zope/(Zope  22
IdeaWebServer/v0.50  21
Netscape-Enterprise/3.6  20
Apache/1.3.22  20
Apache/2.0.58  19
Zeus/4.2  19
Apache/1.3.12  18
WebServerX  18
ConcentricHost-Ashurbanipal/2.0  17
ZX_Spectrum/1997  15
Apache/1.3.6  13
Zeus/3.3  12
Netscape-Enterprise/6.0  12
Apache/1.3.9  11
tigershark/3.0.128  10
Jetty/5.1.10  9
Apache/2.0.47  9
IBM_HTTP_Server  9
Apache/1.3.23  9
IBM_HTTP_Server/6.1.0.5  9
thttpd-TOF  8
.V01  8
Mittwald  8
httpd  8
A  8
.V06  7
Rapidsite/Apa  7
Apache/2.0.59easyTECC/2.0  7
Oracle  7
Apache/1.3.24  7
Apache-AdvancedExtranetServer/1.3.33  7
LiteSpeed  6
Zeus/3.4  6
Rapidsite/Apa/1.3.31  6
.V05  6
Simple  6
AOLserver/3.5.10  6
WebServer  6
nginx/0.5.32  6
IBM_HTTP_Server/6.0.2.15  6
Zeus/4.1  6
.V03  5
.V12  5
nginx/0.5.30  5
thttpd  5
Apache/1.3.3  5
Apache/2.0.39  5
AppleDotMacServer-1B5608  4
.V14  4
e/3  4
.V13  4
.V17  4
lighttpd/1.4.18  4
nginx/0.3.30  4
Apache/1.3.14  4
BaseHTTP/0.3  4
WebSite/3.5.19  4
nginx/0.5.33  4
DinaHTTPd  4
Unknown  3
WebSTAR/4.4(SSL)  3
.V04  3
Acenet  3
.V02  3
thttpd/2.21  3
Ready  3
.V19  3
.V18  3
Apache/2.0.43  3
Mongrel  3
Oracle_Web_Listener/4.0.8.1.0EnterpriseEdition  3
lighttpd/1.4.13  3
Zeus/4_3  3
.V20  3
nginx/0.5.7  3
Sun-Java-System-Web-Server/7.0  3
ENIAC  3
Resin/3.0.23  3
lighttpd  3
Stronghold/2.2  3
WebLogic  3
ZWS  3
AOLserver/4.0.10  3
Oversee  3
hi-ho  3
ConcentricHost-Ashurbanipal/1.8  2
Power  2
Oracle-Application-Server-10g  2
Roxen/4.5.146-release3  2
Blk-Enterprise  2
Oxito  2
Concealed  2
Apache/2.2  2
IBM_HTTP_Server/6.0.2.21  2
httpd+ssl  2
WebSTAR/3.0.2  2
IBM_HTTP_Server/2.0.47  2
HP  2
SmallWebServer/2.0  2
Wake  2
lighttpd/1.4.15  2
lighttpd/1.4.16  2
Apache/2.0.61/DataZone  2
Servage.net  2
Apache/2.0.52-CHS-1  2
WisePanel  2
Apache/1.3.17  2
webserver  2
nginx/0.5.27  2
nginx/0.4.13  2
nginx/0.3.35  2
UserLand  2
*  2
Apache-AdvancedExtranetServer/2.0.53  2
Apache/2.0.59-CHS-1  2
Lotus-Domino/0  2
Jetty/5.1.1  2
MIT  2
Resin/3.0.21  2
Resin/3.0.19  2
Web  2
INT.PL/3.0.9f  2
Stronghold/2.4.2  2
ZealdWeb/1.0  2
.V10  2
IBM_HTTP_Server/6.0.1  2
Apache/1.2.6  2
"USHR  2
OSU/3.10a;Multinet  2
.V09  2
Spry-SafetyWEB-Server-NT/1.3a  2
Oracle-Application-Server-10g/10.1.2.0.0  2
WebSitePro/2.4.9  1
IcodiaSecureHttpd  1
Demandware  1
HTTPd-WASD/6.0.2  1
WebSTAR  1
thttpd/2.25b  1
Lasso/7.0  1
Embperl/2.0rc1  1
EarthServer/1.0  1
Apache/2.2.5  1
Lasso/6.0  1
Apache/1.0.0  1
4D_WebSTAR_S/5.3.3  1
aris  1
Roxen/BuGless  1
Roxen/4.0.325-Debian1.1-release4  1
Apache/1.3.0  1
Oracle-Application-Server-10g/9.0.4.2.0  1
Apache/Not  1
Apache/2.2.x  1
Apache/Unoeuro  1
lighttpd/1.5.0  1
Apache/1.3.11  1
nginx/0.5.14  1
zope.server.http  1
IPL  1
Conundrum  1
Resin/2.1.16  1
Resin/2.1.17  1
4D_WebStar_D/6.74  1
FSSWEB/1.0  1
Oracle9iAS/9.0.2  1
OpenCms/6.0.0  1
AAISP/1.1  1
HTTP  1
Sun  1
Apache/X.X.X  1
NetWare-Enterprise-Web-Server/5.1  1
.V11  1
IBM_HTTP_Server/1.3.12.6  1
Embperl/2.0b9  1
Microsoft-IIS/3.0  1
Zeus  1
lighttpd/1.4.11  1
nginx/0.5.17  1
Sarge  1
Pacohost.com  1
webfusion.co.uk/httpd  1
WebSite/3.1.11  1
nginx/0.5.6  1
IBM_HTTP_SERVER  1
LexisNexis  1
Aleto  1
workSpace  1
Redirector  1
AOLserver/3.4.2  1
VHFFS  1
Resin/3.1.1  1
"IIS/1.0  1
Apache-AdvancedExtranetServer/1.3.29  1
ZAMAN  1
Apache-AdvancedExtranetServer/2.0.47  1
Embperl/2.0rc3  1
gws  1
nginx/0.5.26  1
httpd/ktkb  1
CERN/3.0A  1
Hanbiro  1
IBM_HTTP_SERVER/1.3.28.1-PK27875  1
EnterpriseWeb/1.1.4  1
SAMBAR  1
IBM_HTTP_Server/2  1
IIS  1
Resin/3.1.2  1
IBM_HTTP_SERVER/1.3.26.2  1
sunrise  1
MCWeb  1
Oracle9iAS/9.0.2.3.0  1
Lasso/8  1
Apache/1.3.34+APPLETZ/1.0  1
WebSTAR/4.5(SSL)  1
nginx/0.5.31  1
nginx/0.6.6  1
"Web  1
"SMSNET  1
WebSTAR/4.2(SSL)  1
nginx/0.6.17  1
OSU/3.9a;UCX  1
nginx/0.6.11  1
AOLserver/4.0.3  1
6-lighttpd  1
Caudium/1.4.8  1
IBM_HTTP_SERVER/1.3.19.1  1
Apache/2.0.44  1
Orion/2.0.7  1
Orion/2.0.5  1
Oracle-Application-Server-10g/9.0.4.0.0  1
SAGA  1
NaviServer/2.0  1
Apache-AdvancedExtranetServer/1.3.23  1
Apache-AdvancedExtranetServer/1.3.26  1
WebSTAR/4.2  1
noViA  1
Apache/1.3.4  1
WLW  1
AtyponWS/7.1  1
YTS/1.15.1  1
Undisclosed-Webserver/0.1  1
Apache-AdvancedExtranetServer/2.0.50  1
IIS/7.0  1
publicfile  1
WebSitePro/2.3.7  1
Discworld/0.10.3  1
Sun-Java-System-Web-Server/6.1  1
httpd/kbn  1
Ski  1
Stronghold/4.0  1
4D_WebSTAR_S/5.3.0  1
nginx/0.1.45  1
CoolWeb  1
SAP  1
IceWarp/8.3  1
LoboCom-WebServer/1.0  1
Apache/2.0.36  1
az-webserver  1
Apache/ASJ2007070300SUX-FATE  1
WEB602/1.04  1
nix  1
Apache-AdvancedExtranetServer  1
ONI  1
web500gw  1
Serverplan  1
WebTV/1.0  1
Apache-AdvancedExtranetServer/1.3.31  1
The  1
Apache/10.0  1
OF.PL  1
nginx/0.6.16  1
Reed  1
<a  1
Apache-AdvancedExtranetServer/2.0.48  1
INSIDE  1
THEO+Server/5.0  1
Roxen/2.1.135  1
Zeus/4.0  1
IBM_HTTP_Server/6.0  1
Apache/1.3.32  1
nginx/0.3.25  1
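
Entries such as "Zope/(Zope", "Apache/Not", and "The" suggest that the first token was taken by splitting the Server value on whitespace. That is an inference from the data, not something the source states. Under that assumption, a minimal sketch of the extraction, plus a coarser tally by product name, might look like this. The function names and the sample Server values are hypothetical.

```python
from collections import Counter


def first_product_token(server_value):
    """Return the first whitespace-delimited token of a Server value ('' if none)."""
    tokens = server_value.split()
    return tokens[0] if tokens else ""


def product_name(token):
    """Drop the version part: 'Apache/1.3.37' becomes 'Apache'."""
    return token.partition("/")[0]


# Hypothetical Server header values, for illustration only.
samples = [
    "Apache/1.3.37 (Unix) mod_ssl/2.8.28 OpenSSL/0.9.7a",
    "Apache/2.0.52 (CentOS)",
    "Microsoft-IIS/6.0",
    "Apache",
    "",
]
tokens = [first_product_token(s) for s in samples]
by_product = Counter(product_name(t) for t in tokens)
```

Grouping by product name rather than by full token would merge the many Apache version rows above into a single count, at the cost of losing the version distribution.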

ETag Strength and Validity

Valid (strong)  7962
Valid (weak)  17
(empty)  8
eaccelerator-1351692760  1
5324bfe837728477f89cfb7d983820ed  1
PUB1197419160  1
9a2289b94c8e112ffafb9b55947b48e  1
z-1g3s18hn-97hp-1gqnyiso13-k01hgzde10  1
cd62bd32ab1c41883d5d1bfac81bac4c  1
-nulo-  1
80d83a43b2c97a19f5bfcf1d63aaab2  1
dd4ae2ad389dbfd07a6368195bf67e3  1
d6ee8b54487c3cb997ca2dfa4aa8604d1197405049  1
a5fc223515c5aa4e6ee0408b80e87ff9  1
u-1g3s18hn-13np-1b7pyirc0m-2af1p2zdcp0  1
8dcb0c531a0f2997dab7f4b9669c9c50  1
2N5o/t+2YDW2WBcoDAUm95Sol/M  1
aa5f0107517db4371be051e81d838770  1
fcf53cc192e4d81fd78360f34173b957  1
99dd0b7976bfecce41ee307b5ff4cd091187119998  1
5150fae713f4306aa655016f1c32d6c7  1
49a774f7797266451e7b4bbca61348bf  1
eaccelerator-1431878999  1
1029  1
a2f4e6d2f5146c0ff0484333e92e6190  1
59584b8bdef9e2e442f03cf8945a7229  1
3c36771c2957e19be2403a16a734bb13  1
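
The invalid values listed individually above are all unquoted. In the RFC 2616 grammar an entity tag is a quoted string, optionally prefixed with W/ to mark it as weak. A minimal sketch of a check along those lines follows. It is an assumption about how the three buckets could be computed, not the survey's actual code, and the name classify_etag is hypothetical.

```python
import re

# A quoted-string: double quotes around any characters, with backslash escapes.
_QUOTED = r'"(?:[^"\\]|\\.)*"'
_STRONG = re.compile(_QUOTED)
_WEAK = re.compile("W/" + _QUOTED)


def classify_etag(value):
    """Return 'strong', 'weak', or 'invalid' for an ETag header value."""
    if _STRONG.fullmatch(value):
        return "strong"
    if _WEAK.fullmatch(value):
        return "weak"
    return "invalid"
```

For example, a quoted value such as "5324bfe8" (including the quotes) is strong, the same value with a W/ prefix is weak, and bare values like eaccelerator-1351692760 or an empty header are invalid.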