Here are some links for more context behind the dataset:
To your questions:
The data is only as good as its detection. We use Wappalyzer to do the detection in WebPageTest. While we’re reasonably confident in the detection, there may be some blind spots, such as frameworks that hide their presence from the global scope. @developit has made some efforts to improve this detection upstream in Wappalyzer.
I think the data is solid enough to do an analysis of framework performance (e.g., see CMS Performance - #16 by rviscomi for a similar analysis).
Are you looking up the right URL? For example, this query shows 78 Amazon URLs in the most recent technologies table:
SELECT DISTINCT
  url
FROM
  `httparchive.technologies.2019_02_01_desktop`
WHERE
  NET.REG_DOMAIN(url) = 'amazon.com'
and there are some entries for https://www.amazon.com/ specifically (AWS and Cloudfront).
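To see exactly what was detected for that one page, something like the following might help. It's only a sketch: it assumes the technologies table has `category` and `app` columns next to `url`, and that the URL is stored with the trailing slash, so check the table schema before relying on it.

```sql
-- Sketch: list the technologies detected for a single page.
-- Assumption: the table exposes `category` and `app` columns alongside `url`.
SELECT
  category,
  app
FROM
  `httparchive.technologies.2019_02_01_desktop`
WHERE
  -- Exact match on the full URL, including the scheme and trailing slash.
  url = 'https://www.amazon.com/'
ORDER BY
  category,
  app
```

Matching on the full `url` rather than `NET.REG_DOMAIN(url)` keeps the results to the home page only, instead of every page under the amazon.com domain.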