A sitemap is a navigational blueprint for search engines that helps them crawl and index the important pages of a website. By providing a structured list of URLs, it makes content easier to discover, especially on large or complex sites. A sitemap does not guarantee better rankings, but it helps new and updated content get noticed and indexed sooner, which improves the site’s visibility in search results.
🚀 TL;DR See links below.
Install & Configure
The gatsby-plugin-sitemap plugin makes it simple to add a sitemap. We’ll build on the site created in a previous post. Add gatsby-plugin-sitemap as a dependency in package.json.
Add gatsby-plugin-sitemap to gatsby-config.js as a plugin:
plugins: [
  `gatsby-plugin-sitemap`
]
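The default query (shown further down) reads siteMetadata.siteUrl, so that field has to be present for the plugin to build absolute URLs. A minimal sketch of a complete gatsby-config.js, assuming the site URL that appears in the query results later in this post:

```javascript
// gatsby-config.js (minimal sketch, not the full config of the example site)
const config = {
  siteMetadata: {
    // Assumed value, copied from the query results shown later in this post.
    siteUrl: "https://www.whimsyweb.dev",
  },
  plugins: [
    `gatsby-plugin-sitemap`,
  ],
};

module.exports = config;
```

Any other plugins the site already uses would sit alongside this entry in the same array.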
Build
Build the site then look in public/. You’ll find (at least) two new files: sitemap-index.xml and sitemap-0.xml.
If you have a large site then there may be more files named sitemap-1.xml, sitemap-2.xml, sitemap-3.xml etc. These files contain the actual sitemap data, which is essentially a list of all of the pages in the site. The sitemap-index.xml file acts as an index to these files.
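To make the relationship between the index and the data files concrete, here is a small sketch that pulls the referenced sitemap URLs out of an index document. The listSitemapFiles() helper and the sample XML are illustrative assumptions (the real file is generated at build time), although the sitemapindex/sitemap/loc structure follows the standard sitemap protocol.

```javascript
// Hypothetical helper: collect every <loc> value from a sitemap index document.
function listSitemapFiles(indexXml) {
  const matches = indexXml.matchAll(/<loc>\s*([^<]+?)\s*<\/loc>/g);
  return Array.from(matches, (match) => match[1]);
}

// Illustrative contents only; the actual index lives at public/sitemap-index.xml.
const sampleIndex = `<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap><loc>https://www.whimsyweb.dev/sitemap-0.xml</loc></sitemap>
  <sitemap><loc>https://www.whimsyweb.dev/sitemap-1.xml</loc></sitemap>
</sitemapindex>`;

const sitemapFiles = listSitemapFiles(sampleIndex);
console.log(sitemapFiles);
```

A regular expression is enough for a quick check like this; a proper XML parser would be the safer choice for anything more than inspection.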
🚀 TL;DR
Show me the code. Look at the 3-sitemap-simple branch.
Tweaking
In many cases the default configuration will work perfectly. However, if you need to do something more niche then there are ample ways to tweak the way that this plugin works.
GraphQL
The default GraphQL query for the sitemap is:
{
  site {
    siteMetadata {
      siteUrl
    }
  }
  allSitePage {
    nodes {
      path
    }
  }
}
If the sitemap you get out of the box doesn’t quite meet your needs then you might need to tweak this query. You might also want to filter the pages included in the sitemap. Both of these can be achieved by tweaking gatsby-config.js.
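For simple filtering there may be no need to change the query at all. The sketch below assumes a recent version of the plugin that accepts an excludes option (an array of paths or glob patterns); the paths themselves are hypothetical and only there for illustration.

```javascript
// gatsby-config.js (sketch): drop some paths from the sitemap without a custom query.
const config = {
  siteMetadata: {
    siteUrl: "https://www.whimsyweb.dev",
  },
  plugins: [
    {
      resolve: "gatsby-plugin-sitemap",
      options: {
        // Hypothetical paths; replace with whatever should stay out of the sitemap.
        excludes: ["/drafts/*", "/private/"],
      },
    },
  ],
};

module.exports = config;
```

Check the option name against the README for the plugin version you have installed, since older releases may differ.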
GraphQL Query Returns Edges
Suppose that for some reason we wanted the GraphQL query to return edges rather than nodes.
{
  site {
    siteMetadata {
      siteUrl
    }
  }
  allSitePage {
    edges {
      node {
        path
      }
    }
  }
}
No problem, just specify query in the plugin configuration. However, the plugin still expects to receive a list of nodes, so we also need to use resolvePages to provide a mapping from edges to nodes.
plugins: [
  {
    resolve: "gatsby-plugin-sitemap",
    options: {
      output: "/",
      query: `
        {
          site {
            siteMetadata {
              siteUrl
            }
          }
          allSitePage {
            edges {
              node {
                path
              }
            }
          }
        }`,
      resolvePages: data => data.allSitePage.edges.map(edge => edge.node),
    },
  }
]
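To see what the resolvePages mapping does, it can be run outside of Gatsby against a hand-written result in the edges shape. The sample data below is hypothetical; only the mapping itself comes from the configuration above.

```javascript
// Same mapping as the resolvePages option in the plugin configuration.
const resolvePages = (data) => data.allSitePage.edges.map((edge) => edge.node);

// Hypothetical query result in the edges shape.
const sampleData = {
  site: { siteMetadata: { siteUrl: "https://www.whimsyweb.dev" } },
  allSitePage: {
    edges: [
      { node: { path: "/" } },
      { node: { path: "/what-is-gatsby/" } },
    ],
  },
};

// Each edge is unwrapped to its node, giving the flat list the plugin expects.
const pages = resolvePages(sampleData);
console.log(pages);
```

The result is a plain array of objects with a path field, which is the same shape the default nodes query produces.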
🚀 TL;DR
Show me the code. Look at the 4-sitemap-graphql-edges branch.
Exclude Drafts
If some pages are marked as draft then we probably don’t want those to appear in the sitemap. Drafts can be flagged by adding a page-draft field to the AsciiDoc header. That field can then be extracted via GraphQL, where it appears as draft under pageAttributes.
{
  site {
    siteMetadata {
      siteUrl
    }
  }
  allAsciidoc {
    nodes {
      pageAttributes {
        draft
      }
      fields {
        slug
      }
    }
  }
}
This is what the GraphQL result looks like:
{
  "site": {
    "siteMetadata": {
      "siteUrl": "https://www.whimsyweb.dev"
    }
  },
  "allAsciidoc": {
    "nodes": [
      {
        "pageAttributes": {
          "draft": null
        },
        "fields": {
          "slug": "/what-is-asciidoc/"
        }
      },
      {
        "pageAttributes": {
          "draft": "true"
        },
        "fields": {
          "slug": "/what-is-gatsby/"
        }
      },
      {
        "pageAttributes": {
          "draft": null
        },
        "fields": {
          "slug": "/what-is-tailwind/"
        }
      }
    ]
  }
}
Update the plugin configuration in gatsby-config.js to use this query.
plugins: [
  {
    resolve: "gatsby-plugin-sitemap",
    options: {
      query: `
        {
          site {
            siteMetadata {
              siteUrl
            }
          }
          allAsciidoc {
            nodes {
              pageAttributes {
                draft
              }
              fields {
                slug
              }
            }
          }
        }`,
      resolvePages: data => sitemapQuery(data),
      serialize: ({ path }) => {
        return {
          url: path,
          changefreq: "monthly",
          priority: 0.5,
        };
      }
    }
  }
]
The sitemapQuery() function (invoked in the resolvePages field) filters out the draft posts. It also manually adds in an item for the site landing page that’s not included via the GraphQL query.
function sitemapQuery(data) {
  const posts = data.allAsciidoc.nodes.filter(
    // Exclude draft posts.
    node => node.pageAttributes.draft == null
  ).map(
    (node) => ({
      path: node.fields.slug
    })
  );
  // Add landing page manually since it's not included in the GraphQL results.
  const home = {
    path: '/'
  };
  return [...posts, home];
}
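As a quick sanity check, the filter and the serialize step can be run by hand against the GraphQL result shown earlier. This is only a sketch of the data flow (resolvePages first, then serialize on each page), not how the plugin is invoked internally; the function is repeated here so that the snippet runs on its own.

```javascript
// Copy of sitemapQuery() from above so the snippet is self-contained.
function sitemapQuery(data) {
  const posts = data.allAsciidoc.nodes
    .filter((node) => node.pageAttributes.draft == null) // Exclude draft posts.
    .map((node) => ({ path: node.fields.slug }));
  // Add the landing page manually since it's not in the GraphQL results.
  return [...posts, { path: "/" }];
}

// Same shape as the serialize option in the plugin configuration.
const serialize = ({ path }) => ({ url: path, changefreq: "monthly", priority: 0.5 });

// The query result from earlier in this post.
const result = {
  site: { siteMetadata: { siteUrl: "https://www.whimsyweb.dev" } },
  allAsciidoc: {
    nodes: [
      { pageAttributes: { draft: null }, fields: { slug: "/what-is-asciidoc/" } },
      { pageAttributes: { draft: "true" }, fields: { slug: "/what-is-gatsby/" } },
      { pageAttributes: { draft: null }, fields: { slug: "/what-is-tailwind/" } },
    ],
  },
};

const entries = sitemapQuery(result).map(serialize);
console.log(entries.map((entry) => entry.url));
// → [ '/what-is-asciidoc/', '/what-is-tailwind/', '/' ]
```

The draft post at /what-is-gatsby/ is dropped and the landing page is appended at the end.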
🚀 TL;DR
Show me the code. Look at the 5-sitemap-filter-draft branch.
Conclusion
Adding a sitemap to your site is quite likely to improve its SEO performance. You certainly have nothing to lose! With its default settings, the gatsby-plugin-sitemap plugin adds a quick and dirty sitemap that includes all pages on the site. Alternatively, its options (query, resolvePages and serialize) make it possible to customise the sitemap to precisely your requirements.
The code for this post can be found here.