Hi, I am trying to run some analysis on image data. More specifically, I want each image's type, file size, and image size (dimensions).
I managed to get the first two simply by querying a requests table:
mimeType CONTAINS 'image'
But is there a simple way to get the image size? Or will I have to fetch each image myself to get it?
The Archive does not have the image dimensions, but you could get the image URLs and then query each URL with ImageMagick:
magick identify http://res.cloudinary.com/dougsillars/image/upload/v1532673490/IMG_20150625_192917267_o4bvyk.jpg
gives the response:
http://res.cloudinary.com/dougsillars/image/upload/v1532673490/IMG_20150625_192917267_o4bvyk.jpg=>IMG_20150625_192917267_o4bvyk.jpg JPEG **4160x2340** 4160x2340+0+0 8-bit sRGB 2655710B 0.000u 0:00.049
so something like:
xargs -n 1 magick identify -format '%i,%m,%w,%h,%B\n' < listofimageurls.csv >> output.csv
will query all of the URLs, one per line of the input file, and give you a formatted CSV with the data you are looking for. You might even tune the ImageMagick query to get more detailed information about each image.
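If some of the URLs fail to download, a plain xargs run gives you no row for them, which makes the output hard to join back to the request data. Here is a rough sketch of a loop that keeps one row per URL. It assumes listofimageurls.csv holds one URL per line and that ImageMagick 7 is installed as `magick`. The function name `image_dims_csv` and the `IDENTIFY_CMD` override are made up for this example, not part of ImageMagick. Note that multi-frame images such as animated GIFs repeat the fields once per frame, so those rows may need cleaning up afterwards.

```shell
#!/bin/sh
# Sketch: read one image URL per line on stdin and print CSV rows
# (url,format,width,height,bytes) on stdout.
# IDENTIFY_CMD is a hypothetical hook for swapping in another command;
# by default the function runs "magick identify".
image_dims_csv() {
  echo "url,format,width,height,bytes"
  while IFS= read -r url; do
    # Skip blank lines in the input.
    [ -z "$url" ] && continue
    # %m = image format, %w = width, %h = height, %B = file size in bytes.
    if info=$(${IDENTIFY_CMD:-magick identify} -format '%m,%w,%h,%B' "$url" 2>/dev/null); then
      # Quote the URL so a comma inside it does not break the CSV.
      printf '"%s",%s\n' "$url" "$info"
    else
      # Keep a row for failed fetches so they stay visible in the output.
      printf '"%s",ERROR,,,\n' "$url"
    fi
  done
}

# Usage:
#   image_dims_csv < listofimageurls.csv > output.csv
```

Each image is still fetched one at a time, so for a large list you may want to split the input file and run several copies in parallel.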
Also, [httparchive:runs.latest_requests] is no longer updated, so its data is from February 2018. You want to be using httparchive:summary_requests.2018_07_15_mobile (or desktop) to look at recent requests for images.