How to download full-size scans from Gallica.BnF.fr
Online since 1997, Gallica is the digital library of the National Library of France (Bibliothèque nationale de France) and its partners. Among the 4 million available documents are newspapers, magazines, and other publications that were scanned at good resolutions but can only be downloaded at mediocre ones, at least officially.
A few hours ago, this YouTube video about a crazy idea of Gianni Caproni (more info in Italian) made me want to consult the corresponding page in Le Petit Journal illustré n°1579 shown in the video. A quick search revealed it’s the issue of March 27, 1921. Obviously, this is not about the daily Le Petit Journal, but about the Sunday Le Petit Journal illustré.
And here’s the last page of this issue, on Gallica: https://gallica.bnf.fr/ark:/12148/bpt6k717464d/f12.item.zoom
Now, should I want to download it, the options are not as good as they look:
Should I opt for the JPEG, its width is limited to 1024 pixels. Should I request a PDF, the embedded pages are JPEGs with the same size limitations. Selecting “part of an image in higher/displayed resolution” is highly impractical, as only small cropped portions of the displayed image can be downloaded. What is to be done?
Well, supposing you’re in Chrome, open the Inspector (CTRL+SHIFT+I), go to the Application tab (in the top part), then to Frames, Images, and look for the images called “native.jpg”; supposing you’ve already zoomed into the picture, you’ll see several of them, and only one of them is what you want:
This is the [only] one you want:
Now, right-click, Open in new tab. Disappointingly, you’ll get a tiny image limited in this case to a width of 217: https://gallica.bnf.fr/iiif/ark:/12148/bpt6k717464d/f12/0,0,3468,4096/217,/0/native.jpg
Notice the native size which is hidden in the URL (in our case, 3468×4096) and modify the URL to display the truly native picture by setting the native width: https://gallica.bnf.fr/iiif/ark:/12148/bpt6k717464d/f12/0,0,3468,4096/3468,/0/native.jpg
LATE EDIT: Read the second comment below; the true full URL is better put as https://gallica.bnf.fr/iiif/ark:/12148/bpt6k717464d/f12/full/full/0/native.jpg.
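The URL surgery above can also be scripted. Here is a minimal Python sketch that takes any of the “native.jpg” tile URLs and rewrites the region and size parts to “full/full”, as in the LATE EDIT. The function name and the regular expression are my own, and the pattern only assumes the URL layout seen in the examples above; it is not based on any official Gallica documentation.

```python
import re

# Layout assumed from the example URLs above (not from official documentation):
# https://gallica.bnf.fr/iiif/<ark>/<page>/<region>/<size>/<rotation>/native.jpg
IIIF_RE = re.compile(
    r"^(https://gallica\.bnf\.fr/iiif/ark:/\d+/[^/]+/f\d+)"
    r"/([^/]+)/([^/]+)/([^/]+)/native\.jpg$"
)


def gallica_full_size(url: str) -> str:
    """Rewrite a Gallica tile URL so that it requests the whole page at full size."""
    match = IIIF_RE.match(url)
    if match is None:
        raise ValueError(f"not a recognised Gallica image URL: {url}")
    base, _region, _size, rotation = match.groups()
    # Replace region and size with "full", keep the rotation as it was.
    return f"{base}/full/full/{rotation}/native.jpg"


tile = ("https://gallica.bnf.fr/iiif/ark:/12148/bpt6k717464d/f12/"
        "0,0,3468,4096/217,/0/native.jpg")
print(gallica_full_size(tile))
```

For the tiny 217-pixel tile above, this prints the same “full/full” URL as given in the LATE EDIT.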
Of course, Gallica doesn’t offer that much in terms of scanned French periodicals, and there are other public & free sources as well, such as Heidelberg University’s Digital Library, which includes some French titles of possible interest. In this case, no tricks are needed.
Learning to play with URLs can be very useful though. E.g. on blogs you’ll find images that seem pretty small, and if you try opening them in a new tab, you might see one of these kinds of URLs:
To get the full size, you need to remove the parts in bold:
Why are people so stupid or unimaginative that they don’t even try such things? Homo stupidus stupidus…
A last place where fiddling with URLs can be useful is Amazon. Suppose I’m reading a customer review on Amazon, such as this one. Should I click on any of the pictures, a pop-up opens, showing them at a larger size; but this is not their original size!
To get the original size, right-click on a thumbnail and open it in a new tab. Say this: https://images-na.ssl-images-amazon.com/images/I/71CVsvz1jfL._SY88.jpg
Now, remove everything between the image ID (the part before the first dot of the file name) and the “.jpg” extension to get the original size: https://images-na.ssl-images-amazon.com/images/I/71CVsvz1jfL.jpg
A more complex case: somewhere, in an Amazon page, there is this cropped thumbnail: https://images-na.ssl-images-amazon.com/images/I/61cpRfCXMEL._CR0,204,1224,1224_UX175.jpg
Apply the aforementioned rule to retrieve the full image: https://images-na.ssl-images-amazon.com/images/I/61cpRfCXMEL.jpg
Another case: Google Images gives you the following result on Amazon: https://m.media-amazon.com/images/S/aplus-media/vc/6049fbd5-1322-473b-ba2f-4f893b65e36c.__CR0,0,1000,1000_PT0_SX300_V1___.jpg
Yes, it’s too small, for the real one is this one: https://m.media-amazon.com/images/S/aplus-media/vc/6049fbd5-1322-473b-ba2f-4f893b65e36c.jpg
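The same rule covers all three Amazon examples so far, so it is easy to automate. Below is a small Python sketch (the function name is made up for illustration) that keeps the image ID and the extension and drops whatever sits between them. It assumes, as in the examples above, that the image ID itself contains no dot.

```python
from urllib.parse import urlsplit, urlunsplit


def amazon_original_image(url: str) -> str:
    """Drop the resize/crop modifiers between the image ID and the extension."""
    parts = urlsplit(url)
    directory, _, filename = parts.path.rpartition("/")
    # Image ID: everything before the first dot (assumed to contain no dot).
    image_id, _, rest = filename.partition(".")
    if not rest:
        raise ValueError(f"no file extension found in: {url}")
    # Extension: everything after the last dot.
    extension = rest.rpartition(".")[2]
    new_path = f"{directory}/{image_id}.{extension}"
    return urlunsplit(parts._replace(path=new_path))


for thumb in (
    "https://images-na.ssl-images-amazon.com/images/I/71CVsvz1jfL._SY88.jpg",
    "https://images-na.ssl-images-amazon.com/images/I/61cpRfCXMEL._CR0,204,1224,1224_UX175.jpg",
    "https://m.media-amazon.com/images/S/aplus-media/vc/6049fbd5-1322-473b-ba2f-4f893b65e36c.__CR0,0,1000,1000_PT0_SX300_V1___.jpg",
):
    print(amazon_original_image(thumb))
```

A URL that is already in its short form passes through unchanged.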
The most typical case on Amazon, however, is the limitation of the covers to 1000, 1200 or 1500 pixels on the long side (actually a resizing, as sometimes it works even through upscaling), done by inserting a string before the extension, such as .AC_SL1500 (for limiting to 1500 pixels).
Since we’re at Amazon and URLs, please learn, stupid people of planet Earth, that an Amazon URL need not be like this: https://www.amazon.de/-/en/Francesco-Morini-ebook/dp/B06WGK1HFW/ref=pd_ys_sa_530886031_25?_encoding=UTF8&pd_rd_i=B06WGK1HFW&pd_rd_r=2TK0BS7TZPQGSCCFXEQA&pd_rd_w=HgsJr&pd_rd_wg=8DHnk&pf_rd_p=b590e2fb-7f7b-462f-b256-26d8215ddf09&psc=1&refRID=2TK0BS7TZPQGSCCFXEQA
A link to a product needs to be in one of the two canonical forms:
Note that in many cases, replacing “.de” with any of “.co.uk” / “.fr” / “.it” / “.es” / “.com” in a product’s URL would give you the corresponding page in the respective national Amazon site, should the exact product be present there too.
A trick for the German Amazon.de: to have the page contents in English as much as possible, and to get a link offering to “Translate all reviews to English” (and on each review, “Translate review to English”), insert the substring “/-/en” in the URL like so: https://www.amazon.de/-/en/dp/383657487X/
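For those who would rather not trim links by hand, here is a hedged Python sketch that shortens a bloated product URL to the “/dp/” form used in the Amazon.de example just above, with the national site swap and the “/-/en” trick as options. The function and parameter names are hypothetical, and the pattern assumes the product ID after “/dp/” is ten uppercase letters or digits, as in both examples in this post.

```python
import re
from typing import Optional
from urllib.parse import urlsplit

# Assumption: the product ID after "/dp/" is 10 uppercase letters or digits.
PRODUCT_ID_RE = re.compile(r"/dp/([A-Z0-9]{10})(?:/|$)")


def short_amazon_url(url: str, english: bool = False,
                     host: Optional[str] = None) -> str:
    """Reduce an Amazon product URL to https://<host>[/-/en]/dp/<ID>/."""
    parts = urlsplit(url)
    match = PRODUCT_ID_RE.search(parts.path)
    if match is None:
        raise ValueError(f"no /dp/<ID> segment found in: {url}")
    prefix = "/-/en" if english else ""
    # Keep the original site unless another one (e.g. "www.amazon.fr") is given.
    return f"https://{host or parts.netloc}{prefix}/dp/{match.group(1)}/"


long_url = ("https://www.amazon.de/-/en/Francesco-Morini-ebook/dp/B06WGK1HFW/"
            "ref=pd_ys_sa_530886031_25?_encoding=UTF8&pd_rd_i=B06WGK1HFW&psc=1")
print(short_amazon_url(long_url))
print(short_amazon_url(long_url, english=True))
print(short_amazon_url(long_url, host="www.amazon.fr"))
```

Whether the product actually exists on another national site is, of course, something only the site can tell you.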
This being said, people (and even mainstream news outlets!) are still going to post URLs that stupidly carry the hallmark of Facebook, i.e. have appended shit like “?fbclid=IwAR2HHpIoNRTDnRI75mf_4FNQ34-tmCfdf9d0ZB1m-mOdz3EDTis8Xo6imBo” that nobody could be bothered to delete…
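Deleting that parameter takes one short function. The sketch below uses only Python’s standard library; the function name and the example.com address are placeholders of mine. One caveat: the remaining query parameters are re-encoded, so their escaping may come out slightly different from the original.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def strip_fbclid(url: str) -> str:
    """Return the URL without the "fbclid" query parameter."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "fbclid"
    ]
    # An empty query makes urlunsplit drop the "?" altogether.
    return urlunsplit(parts._replace(query=urlencode(kept)))


# example.com is a placeholder address, not a real article.
shared = ("https://example.com/article?id=7"
          "&fbclid=IwAR2HHpIoNRTDnRI75mf_4FNQ34-tmCfdf9d0ZB1m-mOdz3EDTis8Xo6imBo")
print(strip_fbclid(shared))
```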