Crawlers may run into errors when trying to access URLs on your site. You can go to Google Search Console’s “Crawl Errors” report to identify the URLs where this might be happening – this report will show you both server errors and “not found” errors.
Server log files can also show you this, but since analyzing them is a more advanced tactic, I will cover it in detail in another article.
Before you do anything meaningful with the crawl error report, it is important to understand server errors and “not found” errors.
4xx Codes: When search engine crawlers can’t access your content due to a client error
4xx errors are client errors, meaning the requested URL cannot be found or the request cannot be fulfilled.
One of the most common 4xx errors is the “404 – not found” error. These can be the result of a URL typo, a deleted page, or a broken redirect, to name a few examples. When search engines hit a 404, they can’t access the URL. When users hit a 404 page, they may get frustrated and leave your site.
5xx Codes: When search engine crawlers can’t access your content due to a server error
5xx errors are server errors, meaning the server the web page is located on failed to fulfill the searcher’s or search engine’s request to access the page.
There is a dedicated tab for these errors in Google Search Console’s “Crawl Error” report. They typically happen because the request for the URL timed out, so Googlebot abandoned the request. See Google’s documentation to learn more about fixing server connectivity issues.
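To keep the two error families straight, here is a minimal Python sketch that buckets a status code the way a crawl-error report does. The ranges follow the HTTP specification; the helper name and the wording of the messages are my own:

```python
def classify_status(code: int) -> str:
    """Bucket an HTTP status code the way a crawl-error report would."""
    if 200 <= code < 300:
        return "success"
    if 300 <= code < 400:
        return "redirect"
    if 400 <= code < 500:
        # Client error: the URL can't be found or the request can't be fulfilled.
        return "client error (4xx)"
    if 500 <= code < 600:
        # Server error: the server failed to fulfill an apparently valid request.
        return "server error (5xx)"
    return "non-standard"

# classify_status(404) -> "client error (4xx)"
# classify_status(503) -> "server error (5xx)"
```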
Fortunately, there is a way to tell both searchers and search engines that your page has been moved – a 301 (permanent) redirect.
Suppose you move a page from example.com/seo/ to example.com/search-engine-optimization/. Search engines and users need a bridge to cross from the old URL to the new one. That bridge is a 301 redirect.
The 301 status code itself means the page has permanently moved to a new location, so avoid redirecting URLs to irrelevant pages – URLs where the old URL’s content doesn’t actually live. If a page is ranking for a query and you 301 it to a URL with different content, its ranking position may drop because the content that made it relevant to that query is no longer there. 301s are powerful – move URLs responsibly!
You also have the option of 302 redirecting a page, but this should be reserved for temporary moves and for cases where passing link equity isn’t as big a concern. 302s are kind of like a road detour: you’re temporarily siphoning traffic through a certain route, but it won’t stay that way forever.
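To make the difference between the two concrete, here is a small sketch using Python’s standard http.server module. The paths and the two redirect tables are hypothetical, and a real site would normally configure redirects in its web server (e.g. nginx or Apache) rather than in application code:

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical redirect tables (illustrative paths only).
PERMANENT_MOVES = {"/seo/": "/search-engine-optimization/"}
TEMPORARY_MOVES = {"/sale/": "/holiday-sale/"}  # a short-lived detour

class RedirectHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path in PERMANENT_MOVES:
            # 301: the page has moved for good; crawlers transfer signals to the new URL.
            self.send_response(301)
            self.send_header("Location", PERMANENT_MOVES[self.path])
            self.end_headers()
        elif self.path in TEMPORARY_MOVES:
            # 302: a temporary detour; the old URL is expected to come back.
            self.send_response(302)
            self.send_header("Location", TEMPORARY_MOVES[self.path])
            self.end_headers()
        else:
            self.send_response(200)
            self.end_headers()
            self.wfile.write(b"ok")

    def log_message(self, *args):  # keep the sketch quiet
        pass

# To try it locally:
#   HTTPServer(("127.0.0.1", 8000), RedirectHandler).serve_forever()
```

The design point is simply that the status code, not the Location header, is what tells crawlers whether the move is permanent; both kinds of redirect send a Location.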
Beware of redirect chains!
It can be difficult for Googlebot to reach your page if it has to go through multiple redirects. Google calls these “redirect chains,” and recommends limiting them as much as possible. If you redirect example.com/1 to example.com/2, and later decide to redirect it to example.com/3, it’s best to eliminate the middleman and simply redirect example.com/1 to example.com/3.
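That cleanup can be sketched as a small Python helper. The redirect map below is hypothetical; in practice you would build it from your server configuration or a crawl export:

```python
def final_destination(url: str, redirects: dict[str, str], max_hops: int = 10) -> str:
    """Follow a redirect map until reaching a URL that doesn't redirect."""
    seen = set()
    while url in redirects:
        if url in seen or len(seen) >= max_hops:
            raise ValueError(f"redirect loop or overly long chain at {url}")
        seen.add(url)
        url = redirects[url]
    return url

def flatten_chains(redirects: dict[str, str]) -> dict[str, str]:
    """Point every old URL straight at its final destination – no middlemen."""
    return {src: final_destination(src, redirects) for src in redirects}

# Hypothetical chain: /1 -> /2 -> /3 collapses to /1 -> /3 and /2 -> /3.
chains = {"example.com/1": "example.com/2", "example.com/2": "example.com/3"}
```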
Once you’ve made sure your site is optimized for crawlability, the next order of business is to make sure it can be indexed. I will cover that topic in a separate article.