How can I obtain API request/response data to study?

(Translated from Chinese) Can data collected from mainstream websites be open-sourced for research and study?

Assuming Google Translate got this right as “Can the data collected from mainstream websites be open sourced for research and study?”, what data, specifically?

All of the data that the HTTP Archive collects is freely available in BigQuery (though you’ll have to pay query costs if your usage exceeds the free quota), and the raw HARs are all available in cloud storage if you want to do offline processing.
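
If you go the offline route, here is a minimal Python sketch of pulling the API-like (JSON) requests out of a HAR. It relies only on the standard HAR layout (`log.entries[]`, each with a `request` and a `response`). The `api_entries` helper, the `sample_har` data and the example URLs are made up for illustration, and treating "MIME type contains json" as "this is an API call" is a simplifying assumption, not how the HTTP Archive classifies requests.

```python
import json


def api_entries(har: dict) -> list[dict]:
    """Summarize every HAR entry whose response MIME type mentions JSON."""
    results = []
    for entry in har.get("log", {}).get("entries", []):
        request = entry.get("request", {})
        response = entry.get("response", {})
        # mimeType may be missing or null in real-world HARs
        mime = response.get("content", {}).get("mimeType") or ""
        if "json" in mime.lower():
            results.append({
                "method": request.get("method"),
                "url": request.get("url"),
                "status": response.get("status"),
                "mimeType": mime,
            })
    return results


# Tiny hand-written HAR, for illustration only (not real HTTP Archive data)
sample_har = {
    "log": {
        "entries": [
            {
                "request": {"method": "GET", "url": "https://example.com/"},
                "response": {"status": 200, "content": {"mimeType": "text/html"}},
            },
            {
                "request": {"method": "GET", "url": "https://example.com/api/items"},
                "response": {
                    "status": 200,
                    "content": {"mimeType": "application/json; charset=utf-8"},
                },
            },
        ]
    }
}

for item in api_entries(sample_har):
    print(item["method"], item["status"], item["url"])
```

For a HAR you have downloaded, you would load it with `json.load` from the file (the file name is up to you) and pass the result to `api_entries` instead of `sample_har`.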

That said, the underlying copyright for the content on the pages still belongs to the websites in question, so I wouldn’t recommend something like using the HTTP Archive to train a generative AI model without first making sure the copyrights on the content you would be using allow for that.