
XML sitemaps

A sitemap represents the architecture of a website. It gives webmasters an easy way to show visitors which pages are available, how they are connected, and how the navigation is structured, while also telling search engines which pages are available to crawl.

Good sitemaps help humans find what they are looking for and help search engines target and manage their crawling activity. A sitemap gives the spider a quick guide to the structure of your website and to what has changed since its last visit. Sitemaps are particularly beneficial on websites:

o Where some areas are not accessible through the user interface

o Where webmasters use AJAX, Flash or other rich Internet application (RIA) content that search engines do not process.

Sitemap history

1. Google first introduced Sitemaps 0.84 in June 2005 so web developers could publish lists of links from across their sites. Engineering director Shivakumar posted on the Google blog: “We’re running an experiment called Google Sitemaps that will either fail miserably or succeed beyond our wildest dreams, making the web better for webmasters and users alike. It’s a beta “ecosystem” that can help webmasters with two current challenges: keeping Google informed of all their new web pages or updates, and increasing the coverage of their web pages in Google’s index.

This project doesn’t just belong to Google: we released it under the Creative Commons Attribution/Share Alike license so that other search engines can do a better job too. Eventually we expect this to be natively supported on web servers (e.g. Apache, Lotus Notes, IIS). But to get you started, we offer the Sitemap Generator, an open source Python client for computing sitemaps for some common use cases. Give it a try and give us your feedback.”

2. Google, MSN and Yahoo announced joint support for the Sitemaps protocol in November 2006. The schema version was changed to “Sitemap 0.90”, but no other changes were made.

3. In April 2007, Ask and IBM announced support for Sitemaps. At the same time, Google, Yahoo and Microsoft announced automatic discovery of sitemaps via robots.txt.

XML Sitemap Format

The sitemap protocol consists of XML tags. All data values in a sitemap must be entity-escaped (described below). The file itself must be encoded in UTF-8. The sitemap must:

1. Start with an opening <urlset> tag and end with a closing </urlset> tag.

2. Specify the namespace (protocol standard) within the <urlset> tag.

3. Include a <url> entry for each URL as the parent XML tag.

4. Include a <loc> child entry for each <url> parent tag.

All other tags are optional and their use may vary between search engines.
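As a rough illustration of the four requirements above, the following Python sketch builds a minimal sitemap with the standard library's xml.etree.ElementTree. The function name build_sitemap and the example.com URLs are placeholders invented for this example; the namespace URI is the one published for Sitemap 0.9. Note that ElementTree entity-escapes the ampersand in the second URL on its own when it serializes the text.

```python
import xml.etree.ElementTree as ET

# Namespace URI of the Sitemap 0.9 schema.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a UTF-8 encoded sitemap (bytes) listing the given URLs."""
    # The root <urlset> element carries the namespace.
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for address in urls:
        # One <url> parent per page, each with a required <loc> child.
        url = ET.SubElement(urlset, "url")
        loc = ET.SubElement(url, "loc")
        # ElementTree escapes &, < and > in element text when serializing.
        loc.text = address
    declaration = b'<?xml version="1.0" encoding="UTF-8"?>\n'
    return declaration + ET.tostring(urlset, encoding="utf-8")


if __name__ == "__main__":
    xml_bytes = build_sitemap([
        "http://www.example.com/",
        "http://www.example.com/catalog?item=12&desc=vacation",
    ])
    print(xml_bytes.decode("utf-8"))
```

The output is written on a single line; a real generator might pretty-print it, but whitespace between tags does not matter to crawlers.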

XML tag definitions

1. urlset: This tag is required. It encapsulates the file and references the current protocol standard.

2. url: This tag is required. It is the parent tag for each URL entry.

3. loc: This tag is required. Indicates the URL of the web page. It must start with a protocol (such as http) and end with a trailing slash if your web server requires one. It must be less than 2048 characters.

4. lastmod: This tag is optional. Defines the last modified date of the file. The date must be in W3C Datetime format.

5. changefreq: This tag is optional. Reports how often the page is likely to change. It gives search engines a general hint and does not oblige them to crawl the page at that frequency. Valid values are:

o always

o hourly

o daily

o weekly

o monthly

o yearly

o never

6. priority: This tag is optional. Describes the priority of a URL relative to other URLs on the same website. Its value ranges from 0.0 to 1.0, and the default is 0.5. The priority you assign does not influence how your URLs rank in search engine results pages.
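The tag definitions above can be sketched as a small Python helper that builds one <url> entry and rejects values the protocol does not allow. This is only an illustration: make_url_entry and VALID_CHANGEFREQ are names made up for this example, the changefreq values are the ones the protocol defines (always, hourly, daily, weekly, monthly, yearly, never), and the date is formatted with Python's isoformat(), which produces a W3C Datetime compatible string for timezone-aware values.

```python
from datetime import datetime, timezone
import xml.etree.ElementTree as ET

VALID_CHANGEFREQ = {
    "always", "hourly", "daily", "weekly", "monthly", "yearly", "never",
}


def make_url_entry(loc, lastmod=None, changefreq=None, priority=None):
    """Build one <url> element; optional tags are added only when given."""
    if len(loc) >= 2048:
        raise ValueError("loc must be less than 2048 characters")
    url = ET.Element("url")
    ET.SubElement(url, "loc").text = loc
    if lastmod is not None:
        # Expects a timezone-aware datetime, e.g. 2005-06-01T12:00:00+00:00.
        ET.SubElement(url, "lastmod").text = lastmod.isoformat(timespec="seconds")
    if changefreq is not None:
        if changefreq not in VALID_CHANGEFREQ:
            raise ValueError(f"invalid changefreq: {changefreq!r}")
        ET.SubElement(url, "changefreq").text = changefreq
    if priority is not None:
        if not 0.0 <= priority <= 1.0:
            raise ValueError("priority must be between 0.0 and 1.0")
        ET.SubElement(url, "priority").text = f"{priority:.1f}"
    return url


if __name__ == "__main__":
    entry = make_url_entry(
        "http://www.example.com/",
        lastmod=datetime(2005, 6, 1, 12, 0, tzinfo=timezone.utc),
        changefreq="weekly",
        priority=0.8,
    )
    print(ET.tostring(entry, encoding="unicode"))
```

Validating at build time is a design choice of this sketch; search engines generally just ignore hints they cannot parse.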

Entity escaping

As noted above, the sitemap must be encoded in UTF-8, and all data values must use entity escape codes for the following characters:

o Ampersand (&): &amp;

o Single quote ('): &apos;

o Double quote ("): &quot;

o Greater than (>): &gt;

o Less than (<): &lt;

Sitemap Index Files

Two limits must be considered when creating a sitemap:

1. The sitemap must not contain more than 50,000 URLs.

2. It must not be larger than 10 MB uncompressed. (Current versions of the sitemaps.org protocol raise this limit to 50 MB.)

A sitemap may be compressed, but it must still fit within the size limit when uncompressed. If a site has more than 50,000 URLs, create multiple sitemap files and list each one in a sitemap index file. The sitemap index file is subject to limits of its own. It must:

1. List no more than 1,000 sitemaps. (Current versions of the protocol allow up to 50,000.)

2. Be no larger than 10 MB.

In terms of format, the sitemap index file must:

1. Start with an opening <sitemapindex> tag and end with a closing </sitemapindex> tag.

2. Include a <sitemap> entry for each sitemap as the parent XML tag.

3. Include a <loc> child entry for each <sitemap> parent tag.

The optional <lastmod> tag is also available for sitemap index files.
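One way to picture the splitting step is the Python sketch below, which divides a long list of URLs into groups of at most 50,000 and then writes an index that points at one sitemap file per group. The helper names chunk_urls and build_sitemap_index, and the sitemap1.xml style file names, are assumptions made for this example. The sketch only enforces the URL count, not the file size limit.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_SITEMAP = 50_000


def chunk_urls(urls, size=MAX_URLS_PER_SITEMAP):
    """Split a list of URLs into consecutive groups of at most `size` items."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def build_sitemap_index(sitemap_urls):
    """Return a sitemap index (bytes) listing the URL of each sitemap file."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for address in sitemap_urls:
        # One <sitemap> parent per file, each with a required <loc> child.
        sitemap = ET.SubElement(index, "sitemap")
        ET.SubElement(sitemap, "loc").text = address
    declaration = b'<?xml version="1.0" encoding="UTF-8"?>\n'
    return declaration + ET.tostring(index, encoding="utf-8")


if __name__ == "__main__":
    pages = [f"http://www.example.com/page{n}.html" for n in range(120_000)]
    chunks = chunk_urls(pages)
    # Hypothetical naming scheme: sitemap1.xml, sitemap2.xml, ...
    names = [
        f"http://www.example.com/sitemap{i}.xml"
        for i in range(1, len(chunks) + 1)
    ]
    print(build_sitemap_index(names).decode("utf-8"))
```

Each group would then be written to its own sitemap file under the matching name before the index is published.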

Sitemap file location

The location of a sitemap determines the set of URLs that can be included in that sitemap. A sitemap file located at http://www.example.com/xyz/sitemap.xml can include any URL beginning with http://www.example.com/xyz/ but cannot include URLs beginning with http://www.example.com/photos/. It is therefore highly recommended to place the sitemap file in the root directory of the web server, i.e. at http://www.example.com/sitemap.xml, so that it can cover every URL on the site.
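The scoping rule can be expressed as a short check. The Python sketch below assumes the rule that a sitemap may only list URLs on the same protocol and host, at or below the directory where the sitemap itself lives; url_allowed_in_sitemap is a name invented for this example.

```python
from urllib.parse import urlsplit
import posixpath


def url_allowed_in_sitemap(sitemap_url, page_url):
    """Return True if page_url is at or below the sitemap's directory."""
    sitemap = urlsplit(sitemap_url)
    page = urlsplit(page_url)
    # Protocol and host (including any port) must match exactly.
    if (sitemap.scheme, sitemap.netloc) != (page.scheme, page.netloc):
        return False
    # Directory that contains the sitemap, normalized to end in "/".
    base_dir = posixpath.dirname(sitemap.path)
    if not base_dir.endswith("/"):
        base_dir += "/"
    # Treat a bare host such as http://www.example.com as the root path.
    page_path = page.path or "/"
    return page_path.startswith(base_dir)


if __name__ == "__main__":
    sitemap_location = "http://www.example.com/xyz/sitemap.xml"
    for candidate in (
        "http://www.example.com/xyz/page.html",
        "http://www.example.com/photos/picture.jpg",
    ):
        print(candidate, url_allowed_in_sitemap(sitemap_location, candidate))
```

With the sitemap at the root of the server, the base directory becomes "/", which is why a root-level sitemap can list every page on the host.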

The most important thing to note is that the sitemap file helps to index and not to rank the website. It has been developed to help crawlers know the URLs to be crawled on the website so that those pages can be indexed. It is in no way an aid to improve the ranking of the website on the search engine results page.
