Super Sitemap

SvelteKit sitemap focused on ease of use and making it impossible to forget to add your paths.

Features

  • 🤓 Supports any rendering method.
  • 🪄 Automatically collects routes from /src/routes using Vite + data for route parameters provided by you.
  • 🧠 Easy maintenance: accidentally omitting data for a parameterized route throws an error, so the developer must either explicitly exclude the route pattern or provide an array of data for that param value.
  • 👻 Exclude specific routes or patterns using regex patterns (e.g. ^/dashboard.*, paginated URLs, etc).
  • 🚀 Defaults to 1h CDN cache, no browser cache.
  • 💆 Set custom headers to override default headers: sitemap.response({ headers: {'cache-control': '...'}, ...}).
  • 🫡 Uses SvelteKit's recommended sitemap XML structure.
  • 💡 Google, and other modern search engines, ignore priority and changefreq and use their own heuristics to determine when to crawl pages on your site. As such, these properties are not included by default to minimize KB size and enable faster crawling. Optionally, you can enable them like so: sitemap.response({ changefreq:'daily', priority: 0.7, ...}).
  • 🧪 Well tested.
  • 🫶 Built with TypeScript.
  • 🗺️ (Nearly automatic) sitemap indexes!
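As a rough illustration of the regex-based exclusion mentioned above, the sketch below filters a list of route IDs against a list of patterns. The helper name filterRoutes and the sample routes are hypothetical, and it assumes patterns are tested against route IDs as they appear under /src/routes; the library's internals may differ.

```typescript
// Hypothetical helper for illustration; not part of super-sitemap's public API.
function filterRoutes(routes: string[], excludePatterns: string[]): string[] {
  const regexes = excludePatterns.map((pattern) => new RegExp(pattern));
  // Keep a route only if no exclude pattern matches it.
  return routes.filter((route) => !regexes.some((regex) => regex.test(route)));
}

// Assumed sample route IDs.
const routes = [
  '/',
  '/about',
  '/dashboard',
  '/dashboard/settings',
  '/blog/[page=integer]',
  '/(authenticated)/account'
];

const kept = filterRoutes(routes, [
  '^/dashboard.*',          // routes starting with `/dashboard`
  '.*\\[page=integer\\].*', // routes containing `[page=integer]`
  '.*\\(authenticated\\).*' // routes within the `(authenticated)` group
]);

console.log(kept); // ['/', '/about']
```

Note that brackets and parentheses need to be escaped in the pattern strings because they are regex metacharacters.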

Limitations

  • Excludes lastmod from each item, but a future version could include it for parameterized data items. lastmod would be indeterminate for non-parameterized routes, such as /about. Because of this, Google would likely ignore lastmod anyway, since it only respects lastmod when it is "consistently and verifiably accurate".
  • Does not support image or video sitemap extensions.

Installation

npm i -D super-sitemap

or

bun add -d super-sitemap

Then see the Usage, Robots.txt, & Playwright Test sections.

Usage

Basic example

JavaScript:

// /src/routes/sitemap.xml/+server.js
import * as sitemap from 'super-sitemap';

export const GET = async () => {
  return await sitemap.response({
    origin: 'https://example.com'
  });
};

TypeScript:

// /src/routes/sitemap.xml/+server.ts
import * as sitemap from 'super-sitemap';
import type { RequestHandler } from '@sveltejs/kit';

export const GET: RequestHandler = async () => {
  return await sitemap.response({
    origin: 'https://example.com'
  });
};

The "everything" example

Everything in the example below is optional except origin. paramValues is also required if your app has parameterized routes that are not excluded, because it supplies the data for them.

JavaScript:

// /src/routes/sitemap.xml/+server.js
import { error } from '@sveltejs/kit';
import * as sitemap from 'super-sitemap';
import * as blog from '$lib/data/blog';

export const prerender = true; // optional

export const GET = async () => {
  // Get data for parameterized routes
  let blogSlugs, blogTags;
  try {
    [blogSlugs, blogTags] = await Promise.all([blog.getSlugs(), blog.getTags()]);
  } catch (err) {
    throw error(500, 'Could not load data for param values.');
  }

  return await sitemap.response({
    origin: 'https://example.com',
    excludePatterns: [
      '^/dashboard.*',          // i.e. routes starting with `/dashboard`
      '.*\\[page=integer\\].*', // i.e. routes containing `[page=integer]`–e.g. `/blog/2`
      '.*\\(authenticated\\).*' // i.e. routes within a group
    ],
    paramValues: {
      '/blog/[slug]': blogSlugs, // e.g. ['hello-world', 'another-post']
      '/blog/tag/[tag]': blogTags, // e.g. ['red', 'green', 'blue']
      '/campsites/[country]/[state]': [
        ['usa', 'new-york'],
        ['usa', 'california'],
        ['canada', 'toronto']
      ]
    },
    headers: {
      'custom-header': 'foo' // case insensitive; xml content type & 1h CDN cache by default
    },
    additionalPaths: [
      '/foo.pdf' // e.g. to a file in your static dir
    ],
    changefreq: 'daily', // excluded by default b/c ignored by modern search engines
    priority: 0.7, // excluded by default b/c ignored by modern search engines
    sort: 'alpha' // default is false; 'alpha' sorts all paths alphabetically.
  });
};

TypeScript:

// /src/routes/sitemap.xml/+server.ts
import { error } from '@sveltejs/kit';
import type { RequestHandler } from '@sveltejs/kit';
import * as sitemap from 'super-sitemap';
import * as blog from '$lib/data/blog';

export const prerender = true; // optional

export const GET: RequestHandler = async () => {
  // Get data for parameterized routes
  let blogSlugs, blogTags;
  try {
    [blogSlugs, blogTags] = await Promise.all([blog.getSlugs(), blog.getTags()]);
  } catch (err) {
    throw error(500, 'Could not load data for param values.');
  }

  return await sitemap.response({
    origin: 'https://example.com',
    excludePatterns: [
      '^/dashboard.*',          // i.e. routes starting with `/dashboard`
      '.*\\[page=integer\\].*', // i.e. routes containing `[page=integer]`–e.g. `/blog/2`
      '.*\\(authenticated\\).*' // i.e. routes within a group
    ],
    paramValues: {
      '/blog/[slug]': blogSlugs, // e.g. ['hello-world', 'another-post']
      '/blog/tag/[tag]': blogTags, // e.g. ['red', 'green', 'blue']
      '/campsites/[country]/[state]': [
        ['usa', 'new-york'],
        ['usa', 'california'],
        ['canada', 'toronto']
      ]
    },
    headers: {
      'custom-header': 'foo' // case insensitive; xml content type & 1h CDN cache by default
    },
    additionalPaths: [
      '/foo.pdf' // e.g. to a file in your static dir
    ],
    changefreq: 'daily', // excluded by default b/c ignored by modern search engines
    priority: 0.7, // excluded by default b/c ignored by modern search engines
    sort: 'alpha' // default is false; 'alpha' sorts all paths alphabetically.
  });
};
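To make the shape of paramValues more concrete, here is a minimal sketch of how a parameterized route could be expanded into concrete paths, and why missing data can be detected and reported as an error. The helper expandRoute and its error messages are hypothetical and only illustrate the idea; super-sitemap's own implementation may differ.

```typescript
type ParamValues = Record<string, string[] | string[][]>;

// Hypothetical helper for illustration; super-sitemap's internals may differ.
function expandRoute(route: string, paramValues: ParamValues): string[] {
  const params = route.match(/\[[^\]]+\]/g) ?? [];
  if (params.length === 0) return [route]; // static route, nothing to fill in

  const values = paramValues[route];
  if (!values) {
    throw new Error(`No paramValues provided for route '${route}'`);
  }

  // A single-param route takes strings; a multi-param route takes arrays of strings.
  const rows: (string | string[])[] = values;
  return rows.map((value) => {
    const parts = Array.isArray(value) ? value : [value];
    if (parts.length !== params.length) {
      throw new Error(`Expected ${params.length} values per entry for route '${route}'`);
    }
    let i = 0;
    // Replace each `[param]` segment, left to right, with the next value.
    return route.replace(/\[[^\]]+\]/g, () => parts[i++]);
  });
}

const campsitePaths = expandRoute('/campsites/[country]/[state]', {
  '/campsites/[country]/[state]': [
    ['usa', 'new-york'],
    ['canada', 'toronto']
  ]
});
console.log(campsitePaths); // ['/campsites/usa/new-york', '/campsites/canada/toronto']
```

Calling expandRoute('/blog/[slug]', {}) in this sketch throws, which mirrors the behavior described in the features list: a parameterized route without data is treated as a mistake rather than silently skipped.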

Sitemap Index

You can enable sitemap index support with just two changes:

  1. Rename your route to sitemap[[page]].xml
  2. Pass the page param via your sitemap config

JavaScript:

// /src/routes/sitemap[[page]].xml/+server.js
import * as sitemap from 'super-sitemap';

export const GET = async ({ params }) => {
  return await sitemap.response({
    origin: 'https://example.com',
    page: params.page
    // maxPerPage: 45_000 // optional; defaults to 50_000
  });
};

TypeScript:

// /src/routes/sitemap[[page]].xml/+server.ts
import * as sitemap from 'super-sitemap';
import type { RequestHandler } from '@sveltejs/kit';

export const GET: RequestHandler = async ({ params }) => {
  return await sitemap.response({
    origin: 'https://example.com',
    page: params.page
    // maxPerPage: 45_000 // optional; defaults to 50_000
  });
};

Feel free to always set up your sitemap in this manner given it will work optimally whether you have few or many URLs.

Your sitemap.xml route will now return a regular sitemap when the total number of URLs is less than or equal to maxPerPage (defaults to 50,000, per the sitemap protocol), or a sitemap index when the total exceeds maxPerPage.

The sitemap index will contain links to sitemap1.xml, sitemap2.xml, etc, which contain your paginated URLs automatically.

<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>https://example.com/sitemap1.xml</loc>
  </sitemap>
  <sitemap>
    <loc>https://example.com/sitemap2.xml</loc>
  </sitemap>
  <sitemap>
    <loc>https://example.com/sitemap3.xml</loc>
  </sitemap>
</sitemapindex>
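The pagination described above can be sketched as follows: split the full list of paths into pages of at most maxPerPage entries, then emit one index entry per page. The helpers paginate and sitemapIndexXml are hypothetical names used for illustration only, not the library's API.

```typescript
// Hypothetical helpers for illustration; not super-sitemap's actual implementation.
function paginate(paths: string[], maxPerPage = 50_000): string[][] {
  const pages: string[][] = [];
  for (let i = 0; i < paths.length; i += maxPerPage) {
    pages.push(paths.slice(i, i + maxPerPage));
  }
  return pages;
}

function sitemapIndexXml(origin: string, pageCount: number): string {
  // Pages are numbered from 1: sitemap1.xml, sitemap2.xml, ...
  const entries = Array.from(
    { length: pageCount },
    (_, i) => `  <sitemap>\n    <loc>${origin}/sitemap${i + 1}.xml</loc>\n  </sitemap>`
  );
  return [
    '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ...entries,
    '</sitemapindex>'
  ].join('\n');
}

// Five sample paths with a tiny page size to keep the example readable.
const allPaths = Array.from({ length: 5 }, (_, i) => `/blog/post-${i + 1}`);
const pages = paginate(allPaths, 2);

console.log(pages.length); // 3
console.log(sitemapIndexXml('https://example.com', pages.length));
```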

Sampled URLs

Sampled URLs provides a utility to obtain a sample URL for each unique route on your site–i.e.:

  1. the URL for every static route (e.g. /, /about, /pricing, etc.), and
  2. one URL for each parameterized route (e.g. /blog/[slug])

This can be helpful for writing functional tests, performing SEO analyses of your public pages, & similar.

This data is generated by analyzing your site's sitemap.xml, so keep in mind that it will not contain any URLs excluded by excludePatterns in your sitemap config.

import { sampledUrls } from 'super-sitemap';

const urls = await sampledUrls('http://localhost:5173/sitemap.xml');
// [
//   'http://localhost:5173/',
//   'http://localhost:5173/about',
//   'http://localhost:5173/pricing',
//   'http://localhost:5173/features',
//   'http://localhost:5173/login',
//   'http://localhost:5173/signup',
//   'http://localhost:5173/blog',
//   'http://localhost:5173/blog/hello-world',
//   'http://localhost:5173/blog/tag/red',
// ]

Limitations

  1. Result URLs will not include any additionalPaths from your sitemap config because it's impossible to identify those by a pattern given only your routes and sitemap.xml as inputs.
  2. sampledUrls() does not distinguish between routes that differ only by a param matcher. For example, /foo/[foo] and /foo/[foo=integer] will both be evaluated as /foo/[foo], and only one sample URL will be returned.
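One way to picture the sampling, and the second limitation in particular, is the sketch below. It turns each route into a regex and picks the first sitemap path that matches. The helpers routeToRegex and samplePaths are hypothetical, and sampledUrls() may well work differently internally; the point is only that a [param] segment and a [param=matcher] segment both reduce to "any single segment" once the route is matched against plain paths.

```typescript
// Hypothetical helpers for illustration; sampledUrls() may work differently internally.
function routeToRegex(route: string): RegExp {
  const source = route
    .split('/')
    .map((segment) =>
      /^\[[^\]]+\]$/.test(segment)
        ? '[^/]+' // `[param]` and `[param=matcher]` both match any single segment
        : segment.replace(/[.*+?^${}()|[\]\\]/g, '\\$&') // escape literal segments
    )
    .join('/');
  return new RegExp(`^${source}$`);
}

function samplePaths(routes: string[], paths: string[]): string[] {
  const staticRoutes = new Set(routes.filter((route) => !route.includes('[')));
  const samples: string[] = [];
  for (const route of routes) {
    if (staticRoutes.has(route)) {
      if (paths.includes(route)) samples.push(route);
      continue;
    }
    const regex = routeToRegex(route);
    // Skip static routes so that a route like `/[foo]` does not claim `/about`.
    const match = paths.find((path) => !staticRoutes.has(path) && regex.test(path));
    if (match) samples.push(match);
  }
  return samples;
}

const sampled = samplePaths(
  ['/', '/about', '/blog', '/blog/[slug]', '/blog/tag/[tag]'],
  ['/', '/about', '/blog', '/blog/hello-world', '/blog/another-post', '/blog/tag/red', '/blog/tag/green']
);
console.log(sampled); // ['/', '/about', '/blog', '/blog/hello-world', '/blog/tag/red']
```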

Designed as a testing utility

Both sampledUrls() and sampledPaths() are intended as utilities for use within your Playwright tests. Their design aims for developer convenience (i.e. no need to set up a second sitemap config), not for performance, and they require a runtime with file system access, such as Node, to read your /src/routes. In other words, use them for testing, not as a data source in production.

You can call sampledPaths() at the top of a Playwright test file, as shown below, so that sampledPublicPaths is available to every test in that file.

// foo.test.js
import { expect, test } from '@playwright/test';
import { sampledPaths } from 'super-sitemap';

let sampledPublicPaths = [];
try {
  sampledPublicPaths = await sampledPaths('http://localhost:4173/sitemap.xml');
} catch (err) {
  console.error('Error:', err);
}

// ...

Sampled Paths

Same as Sampled URLs, except it returns paths.

import { sampledPaths } from 'super-sitemap';

const paths = await sampledPaths('http://localhost:5173/sitemap.xml');
// [
//   '/about',
//   '/pricing',
//   '/features',
//   '/login',
//   '/signup',
//   '/blog',
//   '/blog/hello-world',
//   '/blog/tag/red',
// ]
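If you already have a list of sampled URLs and only need the paths, the relationship between the two can be sketched with the standard URL class. This is an illustration of the difference between the two return shapes, not a description of how sampledPaths() is implemented.

```typescript
// Assumed sample data, in the shape sampledUrls() is shown returning.
const sampleUrls = [
  'http://localhost:5173/about',
  'http://localhost:5173/blog/hello-world',
  'http://localhost:5173/blog/tag/red'
];

// `URL` is a global in browsers and in Node; `pathname` drops the origin.
const samplePathnames = sampleUrls.map((url) => new URL(url).pathname);

console.log(samplePathnames); // ['/about', '/blog/hello-world', '/blog/tag/red']
```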

Robots.txt

It's important to create a robots.txt so search engines know where to find your sitemap.

You can create it at /static/robots.txt:

User-agent: *
Allow: /

Sitemap: https://example.com/sitemap.xml

Or, at /src/routes/robots.txt/+server.ts, if you have defined PUBLIC_ORIGIN within your project's .env and want to access it:

import * as env from '$env/static/public';

export const prerender = true;

export async function GET(): Promise<Response> {
  // prettier-ignore
  const body = [
    'User-agent: *',
    'Allow: /',
    '',
    `Sitemap: ${env.PUBLIC_ORIGIN}/sitemap.xml`
  ].join('\n').trim();

  const headers = {
    'Content-Type': 'text/plain'
  };

  return new Response(body, { headers });
}

Playwright Test

It's recommended to add a Playwright test that calls your sitemap.

For pre-rendered sitemaps, you'll receive an error at build time if your data param values are misconfigured. But for non-prerendered sitemaps, your data is loaded when the sitemap is loaded, and consequently a functional test is more important to confirm you have not misconfigured data for your param values.

Feel free to use or adapt this example test:

// /src/tests/sitemap.test.js

import { expect, test } from '@playwright/test';

test('/sitemap.xml is valid', async ({ page }) => {
  const response = await page.goto('/sitemap.xml');
  expect(response.status()).toBe(200);

  // Ensure XML is valid. Playwright parses the XML here and will error if it
  // cannot be parsed.
  const urls = await page.$$eval('url', (urls) =>
    urls.map((url) => ({
      loc: url.querySelector('loc').textContent
      // changefreq: url.querySelector('changefreq').textContent, // if you enabled in your sitemap
      // priority: url.querySelector('priority').textContent,
    }))
  );

  // Sanity check
  expect(urls.length).toBeGreaterThan(5);

  // Ensure entries are in a valid format.
  for (const url of urls) {
    expect(url.loc).toBeTruthy();
    expect(() => new URL(url.loc)).not.toThrow();
    // expect(url.changefreq).toBe('daily');
    // expect(url.priority).toBe('0.7');
  }
});

Querying your database for param values

As a helpful tip, below are a few examples demonstrating how to query an SQL database to obtain data to provide as paramValues for your routes:

-- Route: /blog/[slug]
SELECT slug FROM blog_posts WHERE status = 'published';

-- Route: /blog/category/[category]
SELECT DISTINCT LOWER(category) FROM blog_posts WHERE status = 'published';

-- Route: /campsites/[country]/[state]
SELECT DISTINCT LOWER(country), LOWER(state) FROM campsites;

Using DISTINCT will prevent duplicates in your result set. Use this when your table could contain multiple rows with the same params, like in the 2nd and 3rd examples. This will be the case for routes that show a list of items.

Then, if your result is an array of objects, convert it into an array of arrays of string values:

const arrayOfArrays = resultFromDB.map((row) => Object.values(row));
// [['usa','new-york'],['usa', 'california']]

That's it.
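If you would rather not depend on the key order of each row object, a small variation is to pick columns by name. The sketch below uses made-up rows in place of a real database result; the variable names are hypothetical. It also shows the two shapes paramValues accepts: a flat array of strings for a single-param route, and an array of arrays for a multi-param route.

```typescript
// Assumed rows, standing in for the result of the queries above.
const slugRows: { slug: string }[] = [{ slug: 'hello-world' }, { slug: 'another-post' }];
const campsiteRows: { country: string; state: string }[] = [
  { country: 'usa', state: 'new-york' },
  { country: 'canada', state: 'toronto' }
];

// Single-param route such as /blog/[slug]: a flat array of strings.
const blogSlugValues = slugRows.map((row) => row.slug);

// Multi-param route such as /campsites/[country]/[state]: one inner array per row,
// with values in the same order as the params appear in the route.
const campsiteValues = campsiteRows.map((row) => [row.country, row.state]);

console.log(blogSlugValues); // ['hello-world', 'another-post']
console.log(campsiteValues); // [['usa', 'new-york'], ['canada', 'toronto']]
```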

Going in the other direction, i.e. when loading data for a component for your UI, your database query should typically lowercase both the URL param and value in the database during comparison–e.g.:

-- Remember to escape your `params` values (or use a parameterized query) to prevent SQL injection.
SELECT * FROM campsites WHERE LOWER(country) = LOWER(params.country) AND LOWER(state) = LOWER(params.state) LIMIT 10;

Example output

<urlset
    xmlns="https://www.sitemaps.org/schemas/sitemap/0.9"
    xmlns:news="https://www.google.com/schemas/sitemap-news/0.9"
    xmlns:xhtml="https://www.w3.org/1999/xhtml"
    xmlns:mobile="https://www.google.com/schemas/sitemap-mobile/1.0"
    xmlns:image="https://www.google.com/schemas/sitemap-image/1.1"
    xmlns:video="https://www.google.com/schemas/sitemap-video/1.1">
    <url>
        <loc>https://example/</loc>
        <changefreq>daily</changefreq>
        <priority>0.7</priority>
    </url>
    <url>
        <loc>https://example/about</loc>
        <changefreq>daily</changefreq>
        <priority>0.7</priority>
    </url>
    <url>
        <loc>https://example/blog</loc>
        <changefreq>daily</changefreq>
        <priority>0.7</priority>
    </url>
    <url>
        <loc>https://example/login</loc>
        <changefreq>daily</changefreq>
        <priority>0.7</priority>
    </url>
    <url>
        <loc>https://example/pricing</loc>
        <changefreq>daily</changefreq>
        <priority>0.7</priority>
    </url>
    <url>
        <loc>https://example/privacy</loc>
        <changefreq>daily</changefreq>
        <priority>0.7</priority>
    </url>
    <url>
        <loc>https://example/signup</loc>
        <changefreq>daily</changefreq>
        <priority>0.7</priority>
    </url>
    <url>
        <loc>https://example/support</loc>
        <changefreq>daily</changefreq>
        <priority>0.7</priority>
    </url>
    <url>
        <loc>https://example/terms</loc>
        <changefreq>daily</changefreq>
        <priority>0.7</priority>
    </url>
    <url>
        <loc>https://example/blog/hello-world</loc>
        <changefreq>daily</changefreq>
        <priority>0.7</priority>
    </url>
    <url>
        <loc>https://example/blog/another-post</loc>
        <changefreq>daily</changefreq>
        <priority>0.7</priority>
    </url>
    <url>
        <loc>https://example/blog/tag/red</loc>
        <changefreq>daily</changefreq>
        <priority>0.7</priority>
    </url>
    <url>
        <loc>https://example/blog/tag/green</loc>
        <changefreq>daily</changefreq>
        <priority>0.7</priority>
    </url>
    <url>
        <loc>https://example/blog/tag/blue</loc>
        <changefreq>daily</changefreq>
        <priority>0.7</priority>
    </url>
    <url>
        <loc>https://example/campsites/usa/new-york</loc>
        <changefreq>daily</changefreq>
        <priority>0.7</priority>
    </url>
    <url>
        <loc>https://example/campsites/usa/california</loc>
        <changefreq>daily</changefreq>
        <priority>0.7</priority>
    </url>
    <url>
        <loc>https://example/campsites/canada/toronto</loc>
        <changefreq>daily</changefreq>
        <priority>0.7</priority>
    </url>
    <url>
        <loc>https://example/foo.pdf</loc>
        <changefreq>daily</changefreq>
        <priority>0.7</priority>
    </url>
</urlset>

Changelog

  • 0.14.0 - Adds sitemap index support.
  • 0.13.0 - Adds sampledUrls() and sampledPaths().
  • 0.12.0 - Adds config option to sort 'alpha' or false (default).
  • 0.11.0 - BREAKING: Rename to super-sitemap on npm! 🚀
  • 0.10.0 - Adds ability to use unlimited dynamic params per route! 🎉
  • 0.9.0 - BREAKING: Adds configurable changefreq and priority and excludes these by default. See the README's features list for why.
  • 0.8.0 - Adds ability to specify additionalPaths that live outside /src/routes, such as /foo.pdf located at /static/foo.pdf.

Contributing

git clone https://github.com/jasongitmail/super-sitemap.git
cd super-sitemap
bun install
# Then edit files in `/src/lib`

Publishing

A new version of this npm package is automatically published when the semver version within package.json is incremented.

Credits