The pages table contains one row for every web page tested. The latest crawl has about 500,000 pages. For example, the row where
url = 'http://www.microsoft.com/' has a unique page ID of 78071569. The row contains summary statistics about the page.
The requests table contains one row for every request in a page's test. The latest crawl has about 50,000,000 requests, or an average of 100 requests per page. There are 167 rows with Microsoft's page ID, and each of them has its own unique request ID.
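The one-to-many relationship between the two tables can be sketched with a small in-memory SQLite database. This is a toy illustration only: the tables are cut down to a few columns, the `requestid` column name and the three sample request rows are assumptions made up for the example, and the real dataset is far larger and is not stored in SQLite.

```python
import sqlite3

# Toy stand-ins for the two tables; the column set and rows are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE pages (pageid INTEGER PRIMARY KEY, url TEXT);
    CREATE TABLE requests (requestid INTEGER PRIMARY KEY, pageid INTEGER, url TEXT);
""")
conn.execute("INSERT INTO pages VALUES (78071569, 'http://www.microsoft.com/')")
# Three made-up requests for that page (the real page has 167).
conn.executemany(
    "INSERT INTO requests VALUES (?, ?, ?)",
    [
        (1, 78071569, "http://www.microsoft.com/"),
        (2, 78071569, "http://www.microsoft.com/a.css"),
        (3, 78071569, "http://www.microsoft.com/b.js"),
    ],
)


def requests_per_page(connection):
    """Return (page url, request count) pairs by joining requests to pages on pageid."""
    return connection.execute("""
        SELECT p.url, COUNT(r.requestid)
        FROM pages AS p
        LEFT JOIN requests AS r ON r.pageid = p.pageid
        GROUP BY p.pageid
        ORDER BY p.pageid
    """).fetchall()


print(requests_per_page(conn))
```

Each request row carries the page ID of the page it belongs to, so counting requests per page is a join on `pageid` followed by a `GROUP BY`.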
Here's a graphical explanation:
So if you wanted to find out the server software used for Microsoft's home page, you could do something like this:
firstHtml = true AND
pageid = 78071569
The resp_server field corresponds to the
Server response header. Be aware that this header is not required and many websites omit it from the response, in which case the field is empty when queried.
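A minimal sketch of that lookup, including the case where the header is missing, might look like the following. It runs against an in-memory SQLite table rather than the real dataset; the `requestid` column, the helper name `server_for_page`, the page ID 12345, and the value `ExampleServer/1.0` are all hypothetical and exist only to make the example runnable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE requests (
        requestid INTEGER PRIMARY KEY,
        pageid INTEGER,
        firstHtml INTEGER,   -- SQLite has no boolean type; 1 stands for true
        resp_server TEXT
    )
""")
conn.executemany(
    "INSERT INTO requests VALUES (?, ?, ?, ?)",
    [
        (1, 78071569, 1, "ExampleServer/1.0"),  # hypothetical Server value
        (2, 78071569, 0, None),                 # a non-HTML subresource
        (3, 12345, 1, None),                    # hypothetical page that omits Server
    ],
)


def server_for_page(connection, pageid):
    """Return resp_server for the page's first HTML request, or None if absent."""
    row = connection.execute(
        "SELECT resp_server FROM requests WHERE firstHtml = 1 AND pageid = ?",
        (pageid,),
    ).fetchone()
    # Treat a missing row, NULL, and an empty string alike as "no Server header".
    return row[0] if row and row[0] else None


print(server_for_page(conn, 78071569))
print(server_for_page(conn, 12345))
```

Returning `None` for both an empty value and a missing row keeps callers from mistaking an omitted header for a real server name.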