How can I find out how many websites use h5 and h6?
The response_bodies tables on BigQuery contain the raw HTML for each web page (in addition to other text-based resources like JS and CSS).
You can count the distinct pages WHERE body LIKE '%<h6%' or similar (and likewise '%<h5%' for h5), perhaps also taking uppercase tags such as <H6 into account.
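For illustration, here is a rough sketch of such a query in BigQuery standard SQL. The table name is a placeholder (I'm assuming the usual per-crawl naming, so substitute a real table), and I'm assuming the tables expose `page` and `body` columns. It uses a case-insensitive regex instead of LIKE so that both <h6 and <H6 match, and requires whitespace or > after the tag name.

```sql
#standardSQL
-- Sketch only: counts distinct pages with at least one response body
-- containing an <h5> or <h6> opening tag, in any capitalization.
-- The table name is a placeholder, not a real table; replace YYYY_MM_DD
-- with an actual crawl date.
SELECT
  COUNT(DISTINCT IF(REGEXP_CONTAINS(body, r'(?i)<h5[\s>]'), page, NULL)) AS pages_with_h5,
  COUNT(DISTINCT IF(REGEXP_CONTAINS(body, r'(?i)<h6[\s>]'), page, NULL)) AS pages_with_h6,
  COUNT(DISTINCT page) AS total_pages
FROM
  `httparchive.response_bodies.YYYY_MM_DD_desktop`
```

Note that this scans every response body, including JS and CSS, so a match could in principle come from a script that contains the string rather than from the page's own HTML.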
That should be enough to get started, but let me know if you need help writing the query.
Be aware that the response bodies are very large and the entire dataset is 853 GB, so make sure you have enough free BigQuery quota.