Pagefind is quite a find for site search

It used to be that having search on a static site was a hassle — and perhaps an expensive one — but Pagefind has changed all that.

Last modified 2022-08-05

I noted in my summary of the recent HugoConf 2022 event that the host, CloudCannon, had used the online gathering to announce Pagefind. Developed principally by CloudCannon’s Liam Bigelow, Pagefind is a new free/open-source tool for quickly adding site-wide search to a website which, like this one, originates from a static site generator (SSG). Bigelow’s video presentation gave HugoConf “attendees” an introduction to, and demo of, Pagefind.

While I don’t know whether the Algolia folks are exactly shaking in their boots over Pagefind, perhaps they should be. Even though it’s been available for only a few weeks, it’s already really good. After my initial try-out with Pagefind v.0.4.1 during HugoConf, I pronounced it “code-light and staggeringly fast.” Then, earlier this week, I was even more pleased to see that, with the release of v.0.5.0, Bigelow had added the one change I really wanted to see: the option to keep images out of the search. In 0.4.1, it was easy enough to hide them with CSS, but they still got downloaded. I was looking to avoid that for the sake of leaner browser-side performance and, now, in 0.5.0+, a simple setting lets me choose to do exactly that. With this last gotcha gone, I was pleased to put Pagefind on this site earlier today, even adding “Search” to the nav menu.[1]

Build before you crawl

One key to using Pagefind, whether in dev mode or on your host, is that it has to run after your site has actually been built, so that there are real HTML files for Pagefind to “crawl.” This distinction matters because, when a typical SSG spins up a site in development mode, it keeps everything in RAM rather than writing HTML files to disk. Pagefind needs those files, so you must first do an actual build to disk. Pagefind has its own dev server for use in such cases, so you can preview how it’ll look before you push to your host (more on that in a bit).
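
As a rough guard for this build-before-crawl rule, a script can refuse to start Pagefind when the output directory holds no HTML files. This is only a sketch: the function name has_built_html is made up for illustration and is not something Pagefind provides, and public is simply Hugo's usual output directory.

```shell
# Hypothetical helper (not part of Pagefind): succeeds only if the given
# directory exists and contains at least one .html file on disk.
has_built_html() {
  dir="$1"
  [ -d "$dir" ] || return 1
  # Print the first matching file, then check whether anything was printed.
  [ -n "$(find "$dir" -type f -name '*.html' | head -n 1)" ]
}

# Example use (commented out so the snippet has no side effects):
# has_built_html public && npx -y pagefind --source "public" --serve
```

If the check fails, the likely cause is that only the SSG's in-memory dev server has been run and no real build has been written to disk yet.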

As of now, Pagefind works only on macOS or Linux (the latter obviously covers just about any web hosting vendor, and certainly the Jamstack-savvy ones); there’s not yet a Windows version. (Update, 2022-08-04: Now, there is a Windows version, too.)

You can run Pagefind either by using the following command, which automatically[2] installs the latest release:

npx -y pagefind --source "public" --serve

. . . or by downloading its binary and putting it in your system PATH.[3] If you do the latter, your command would be simply:

pagefind --source "public" --serve
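
If a script has to work on machines both with and without the binary installed, it could pick between the two forms at run time. The following is a minimal sketch under that assumption; pick_runner is a hypothetical name, and the only real commands involved are the two shown above.

```shell
# Hypothetical helper: print the command to use for a tool, preferring a
# binary already on PATH and otherwise falling back to npx.
pick_runner() {
  tool="$1"
  if command -v "$tool" >/dev/null 2>&1; then
    printf '%s\n' "$tool"
  else
    printf 'npx -y %s\n' "$tool"
  fi
}

# Example use (commented out; it would start Pagefind's dev server):
# $(pick_runner pagefind) --source "public" --serve
```

The unquoted command substitution in the example is deliberate, so that "npx -y pagefind" is split back into separate words before it runs.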

For my site, I’ve created a shell script for use with Hugo and Pagefind:

rm -rf public
hugo --gc --minify
npx -y pagefind --source "public" --serve

This way, all I have to do is enter ./ in my chosen terminal app and, within a few seconds, Pagefind is showing me a local dev view of my site, with search working, at http://localhost:1414.
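
One thing a plain three-line script does not do is stop when a step fails, so a broken Hugo build would still be followed by an attempt to launch Pagefind. Here is one way that could be handled, offered as a sketch rather than as the script I actually use; run_steps is a made-up helper name.

```shell
# Hypothetical helper: run each command string in order and stop at the
# first one that exits with a non-zero status.
run_steps() {
  for step in "$@"; do
    printf '==> %s\n' "$step"
    sh -c "$step" || {
      printf 'step failed: %s\n' "$step" >&2
      return 1
    }
  done
}

# Example use (commented out; it needs Hugo and Node.js installed):
# run_steps 'rm -rf public' 'hugo --gc --minify' \
#   'npx -y pagefind --source "public" --serve'
```

Putting set -e at the top of the script gets a similar effect with less code; the function form just makes it clearer which step failed.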

Once you’ve satisfied yourself that all is well in dev mode and you’re ready to put Pagefind on your production site, you must alter your hosting process so that on the host, as noted before, Pagefind runs after your SSG builds the site to the appropriate directory for publication — e.g., dist/ for Astro, _site/ for Eleventy, or public/ for Hugo.

Since I’m using a GitHub Action (GHA) to put my site on Cloudflare Pages, all I had to do was add one additional step to that GHA, between the “Build site with Hugo” step and the “Publish to CFP” step:

      - name: Run Pagefind
        run: npx -y pagefind --source "public"

(Obviously, you wouldn’t use Pagefind’s --serve flag here!)
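
Before the publish step, it can be worth confirming that Pagefind actually wrote its index. The sketch below assumes the bundle lands in a _pagefind directory inside the source directory, which is my understanding of the default for the releases discussed here; check your own build output if in doubt. The function name pagefind_bundle_ok is hypothetical.

```shell
# Hypothetical check: succeed only if <source>/<bundle> exists and is not
# empty. The default bundle name "_pagefind" is an assumption about the
# Pagefind version in use; pass a second argument to override it.
pagefind_bundle_ok() {
  bundle="$1/${2:-_pagefind}"
  [ -d "$bundle" ] && [ -n "$(ls -A "$bundle")" ]
}

# Example use in a CI step (commented out):
# pagefind_bundle_ok public || { echo "no Pagefind index found" >&2; exit 1; }
```

Failing the workflow at this point keeps a site with a missing search index from being published.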

If you’re not using a GHA or other, similar scripting approach, you still should find it easy to add Pagefind to your site-building process. In your chosen host’s GUI, just use && to tack an npx pagefind command onto your site’s usual build command. Here are some examples:

# With Astro
npm run build && npx -y pagefind --source "dist"

# With Eleventy
npm run build && npx -y pagefind --source "_site"

# With Hugo
hugo && npx -y pagefind --source "public"
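
The pattern in these examples is always the same: the SSG's build command, then Pagefind pointed at that SSG's output directory. As an illustration only, a small helper could assemble the string for you. The name pagefind_build_cmd is made up, and the build commands and directories are just the ones listed in this post.

```shell
# Hypothetical helper: print a combined build command for a given SSG,
# using the build commands and output directories named in this post.
pagefind_build_cmd() {
  case "$1" in
    astro)    build='npm run build'; out='dist' ;;
    eleventy) build='npm run build'; out='_site' ;;
    hugo)     build='hugo';          out='public' ;;
    *) printf 'unknown SSG: %s\n' "$1" >&2; return 1 ;;
  esac
  printf '%s && npx -y pagefind --source "%s"\n' "$build" "$out"
}

pagefind_build_cmd hugo
```

The last line prints hugo && npx -y pagefind --source "public", which is the string you would paste into the host's build-command field.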

Update, 2022-07-31

After reading this post, Hugo expert Régis Philibert looked through the Pagefind documentation and came up with a method that allows you to use Pagefind in Hugo’s dev mode. This lets you retain the various advantages of the Hugo dev server, such as live updates as you edit material. For future reference (in case I don’t continue to use the giscus commenting setup, in which he suggested this), I’m recording Philibert’s suggestion here.

Note: Before proceeding, add static/_pagefind to your Hugo project’s .gitignore file.

  1. Generate a build by entering hugo in your terminal app.
  2. From the terminal, run:
    npx -y pagefind --source public --bundle-dir ../static/_pagefind
    The --bundle-dir flag will tell Pagefind to store its “crawl” results in, and source them from, a static/_pagefind directory rather than the default.
  3. Run hugo server and, lo and behold, you’re running the Hugo dev server and you have Pagefind search working, just as in production.

Of course, you should leave the production instructions as previously noted; this is for dev purposes only.

Here’s a shell script version:

hugo
npx -y pagefind --source public --bundle-dir ../static/_pagefind
hugo server
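
The .gitignore note that precedes these steps can also be scripted so that it is safe to run repeatedly. This is a sketch under the assumption that you run it from the Hugo project root; ensure_gitignored is a hypothetical helper name.

```shell
# Hypothetical helper: append an entry to a .gitignore file only if that
# exact line is not already present. Assumes the file, if non-empty,
# already ends with a newline.
ensure_gitignored() {
  entry="$1"
  file="${2:-.gitignore}"
  touch "$file"
  grep -qxF -- "$entry" "$file" || printf '%s\n' "$entry" >> "$file"
}

# Example use from the Hugo project root (commented out):
# ensure_gitignored static/_pagefind
```

Because grep -x matches whole lines only, running the helper twice leaves a single static/_pagefind entry in the file.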

Thanks to Philibert for coming up with this, and to Rodrigo Alcaraz de la Osa for suggesting that I add it to this post!

Now, back to our regularly scheduled post, still in progress . . .

Fast, fast finds

In my tests a few weeks ago, I ran Pagefind only locally, so this was my first experience with deploying it for real out on Cloudflare Pages. In my own use thus far, Pagefind works very quickly out on the host:

Indexed 269 pages
Indexed 15851 words
Indexed 0 filters
Created 19 index chunks
Finished in 1.360 seconds
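
To put that log in perspective, the page count and elapsed time can be turned into a rough throughput figure. The snippet below is a sketch that assumes the summary lines keep the wording shown above; pages_per_second is a made-up name.

```shell
# Hypothetical helper: read Pagefind's summary lines on stdin and print
# pages indexed per second, to one decimal place.
pages_per_second() {
  awk '
    /^Indexed [0-9]+ pages/ { pages = $2 }
    /^Finished in /         { secs = $3 }
    END { if (secs > 0) printf "%.1f\n", pages / secs }
  '
}

printf 'Indexed 269 pages\nFinished in 1.360 seconds\n' | pages_per_second
```

For the run shown here, that works out to 197.8 pages per second.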

As for how quickly it works once the index is there: well, you already saw Bigelow’s video, above, but give my search page a try and see for yourself.

I’d recently been using DuckDuckGo for site search, but Pagefind is so much nicer; and, since it’s just a babe in the woods, it’ll only get better. I strongly suggest you give Pagefind a try, especially if your site is as big as, or larger than, mine. Make it easy on yourself and your visitors, who don’t want to dig through paginated post lists or deal with external search engines when they want to find something on your site. With Pagefind in the SSG user’s toolbox, the power of real site search is now easily available — and you surely can’t beat the price.

  1. The styling you’ll see on the resulting search page is something I’ve supplied to keep its appearance consistent with that of the rest of the site, although Pagefind comes with its own styling CSS if you prefer to use it. ↩︎

  2. The -y flag gives a pre-emptive “Yes” answer to the prompt in which npx asks whether it’s allowed to install the pagefind package. ↩︎

  3. If I preferred to use the binary on my Mac, the script’s last line would be just pagefind --source "public" --serve. The advantage of npx pagefind is that you always get the newest version. Its only real disadvantage vs. using the binary is that you must be online to use npx pagefind — although, IMHO, there’s not much point in doing web dev if one isn’t online, so that last item may be of little concern. ↩︎
