We've received a flood of questions over the past few days about a recent update to our Googlebot documentation. Specifically, we've documented that when fetching certain file types, Googlebot only ever "sees" the first 15 megabytes (MB). This limit has been in place for a long time. We added it to our documentation recently because it rarely changes and because some people might find it useful when debugging.
This limit does not include the resources referenced on the page; it applies only to the bytes (content) received for the initial request that Googlebot makes. For instance, when you open https://example.com/puppies.html, your browser first downloads the bytes of the HTML file. Then, based on those bytes, it may request other files that the HTML references, such as external JavaScript, images, or other resources. Googlebot does the same.
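To make the distinction concrete, here is a minimal Python sketch. It is only an illustration, not how Googlebot is implemented: the sample markup, the `ReferenceCollector` class, and the choice of tags to inspect are all assumptions made for this example. It counts the bytes of the HTML itself and lists the URLs that would each be requested separately.

```python
from html.parser import HTMLParser


class ReferenceCollector(HTMLParser):
    """Collects URLs of resources that a page references (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.references = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        # Scripts and images point at their resource through "src".
        if tag in ("script", "img") and attributes.get("src"):
            self.references.append(attributes["src"])
        # Stylesheets and similar resources use "href" on <link>.
        elif tag == "link" and attributes.get("href"):
            self.references.append(attributes["href"])


# Made-up sample page, standing in for https://example.com/puppies.html
html = (
    '<html><head>'
    '<link rel="stylesheet" href="https://example.com/style.css" />'
    '<script src="https://example.com/app.js"></script>'
    '</head><body>'
    '<img src="https://example.com/images/puppy.jpg" alt="cute puppy" />'
    '</body></html>'
)

collector = ReferenceCollector()
collector.feed(html)

# Only these bytes belong to the initial request.
initial_request_bytes = len(html.encode("utf-8"))
print("Bytes in the initial HTML response:", initial_request_bytes)
print("Fetched separately:", collector.references)
```

Only `initial_request_bytes` is measured against the limit; each URL in `collector.references` is its own fetch.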
How does the 15 MB restriction affect me?
Most likely not at all. Very few pages on the internet are larger than that. Given that the average HTML file is only about 30 kilobytes (kB), you, dear reader, are unlikely to own one. However, if you do own an HTML page larger than 15 MB, consider moving some of the inline scripts and CSS to external files.
After 15 MB, what happens to the content?
Only the first 15 MB of content is forwarded to indexing; anything after that is dropped by Googlebot.
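As a rough illustration of that cutoff, the sketch below splits a response body at the limit. The function name `split_at_limit` is hypothetical, and treating 1 MB as 1024 × 1024 bytes is an assumption; the text above only says "15 MB".

```python
# Assumption: 1 MB = 1024 * 1024 bytes; the text above only says "15 MB".
LIMIT_BYTES = 15 * 1024 * 1024


def split_at_limit(body: bytes, limit: int = LIMIT_BYTES) -> tuple[bytes, bytes]:
    """Return (kept, dropped): the bytes up to the limit and everything after it."""
    return body[:limit], body[limit:]


# A made-up oversized page: 12 bytes repeated 2,000,000 times.
html = b"<p>puppy</p>" * 2_000_000

kept, dropped = split_at_limit(html)
print("Total bytes:  ", len(html))
print("Kept bytes:   ", len(kept))
print("Dropped bytes:", len(dropped))
```

Anything that ends up in `dropped`, including content that happens to sit at the bottom of a very large page, would not reach indexing.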
What forms of content are subject to the 15 MB restriction?
The 15 MB limit applies to fetches made by Googlebot (Googlebot Desktop and Googlebot Smartphone) when it downloads file types supported by Google Search.
Does this mean Googlebot can't see my images or videos?
No. Googlebot fetches the images and videos that the HTML references by URL (for example, <img src="https://example.com/images/puppy.jpg" alt="cute puppy looking very disappointed" />) separately, in subsequent fetches.
Do data URIs increase the size of the HTML file?
Yes. Since data URIs are contained within the HTML file, their use will increase its size.
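The effect is easy to see with a small sketch. The placeholder image bytes and the `data_uri` helper below are assumptions for illustration; the point is that base64 encoding adds roughly one third on top of the original file size, and all of it lands inside the HTML.

```python
import base64


def data_uri(raw: bytes, mime: str = "image/jpeg") -> str:
    """Build a base64 data URI for the given bytes (hypothetical helper)."""
    encoded = base64.b64encode(raw).decode("ascii")
    return f"data:{mime};base64,{encoded}"


# Placeholder for a 300,000-byte image; real image bytes would behave the same.
image = bytes(300_000)

inline_tag = f'<img src="{data_uri(image)}" alt="cute puppy" />'
linked_tag = '<img src="https://example.com/images/puppy.jpg" alt="cute puppy" />'

print("Inline (data URI) tag size:", len(inline_tag), "bytes")
print("Linked (URL) tag size:     ", len(linked_tag), "bytes")
```

The linked version adds only the length of the tag to the HTML, because the image itself is fetched separately.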
How can I find out a page's size?
There are several ways, but the simplest is probably to use your own browser and its Developer Tools. Load the page as usual, open the Developer Tools, and select the Network tab. Reload the page, and you should see every request your browser made to render it. The first request in the list is the one you're looking for; its Size column shows the page's size in bytes.
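If you prefer a script, a minimal Python sketch along the following lines is one of the other options. The function names are hypothetical, the 1 MB = 1024 × 1024 bytes conversion is an assumption, and the demonstration uses a `data:` URL so that it runs without network access; in practice you would pass your own page's URL instead.

```python
import urllib.request

# Assumption: 1 MB = 1024 * 1024 bytes; the text above only says "15 MB".
LIMIT_BYTES = 15 * 1024 * 1024


def page_size_bytes(url: str) -> int:
    """Fetch only the resource at `url` and return the length of its body in bytes."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return len(response.read())


def is_over_limit(size: int, limit: int = LIMIT_BYTES) -> bool:
    """Report whether a body of `size` bytes exceeds the limit."""
    return size > limit


# Offline demonstration: a data: URL stands in for a real page such as
# https://example.com/puppies.html
size = page_size_bytes("data:text/html,<p>puppies</p>")
print("Page size in bytes:", size)
print("Over the limit:", is_over_limit(size))
```

Like the first request in the Network tab, this measures only the HTML response itself, not the scripts, stylesheets, or images it references.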
Source: Google Search Central