How to Identify Orphan Pages on Your Site: Quick Wins

Orphan pages are website pages that are not linked to from any other page on your site. Search engine crawlers that discover content by following internal links cannot reach them, so they represent missed opportunities to acquire and engage customers. Here’s a quick win recipe for dealing with orphan pages.

How do you find and resolve orphan site pages?

You can follow an easy 5-step process to identify and address any orphan pages on your site:

  • Get a full list of your current website pages
  • Run a website crawl for pages with zero inbound internal links
  • Analyze the audit results
  • Resolve any orphan page found
  • Rerun the audit periodically to catch new orphan pages

Let’s quickly dive into each of these steps.

1) Get a full list of your current website pages

Pointing your favorite website audit solution at your home page and expecting it to identify orphan pages won’t work: by definition, orphan pages are not linked to from any page on your domain, so a crawler that follows links will never find them. Instead, you need to give the crawler the full list of site URLs to examine. There are a few ways to get that list:

Use your sitemap file

The sitemap is a file that’s typically placed at the root of your domain to help search engine bots discover the pages on your site and understand how often you update them. Many Content Management Systems (CMS) update the sitemap automatically when you add a new page or post, but make sure your sitemap contains the full list of your pages before using this technique.
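To check what a sitemap actually contains before relying on it, you can extract its URLs and compare them with what you expect. The following is a minimal sketch in Python, assuming a standard sitemaps.org `urlset` file; the function name `urls_from_sitemap` and the sample URLs are hypothetical and used only for illustration.

```python
import xml.etree.ElementTree as ET

# XML namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def urls_from_sitemap(xml_text: str) -> list[str]:
    """Return the <loc> value of every <url> entry in a sitemap document."""
    root = ET.fromstring(xml_text)
    urls = []
    for loc in root.findall("sm:url/sm:loc", SITEMAP_NS):
        if loc.text and loc.text.strip():
            urls.append(loc.text.strip())
    return urls


# Hypothetical sitemap content, used only for illustration.
SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://www.example.com/</loc></url>
  <url><loc>https://www.example.com/pricing</loc></url>
</urlset>"""

sitemap_urls = urls_from_sitemap(SAMPLE_SITEMAP)
print(sitemap_urls)
```

This sketch handles a single `urlset` file only. A sitemap index that points to several child sitemaps would need each child file processed the same way.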

Using BrightEdge ContentIQ, you can configure a new crawl and point it to the sitemap file:

[Screenshot: ContentIQ sitemap crawl configuration]

Download a site URL list

If a sitemap is not an option (for example, if the sitemap does not contain the full page list), you can generate the list from your CMS. On WordPress, for example, you can install a lightweight plugin, such as List URLs, to export a list of site URLs as a CSV file.

You can also ask your IT team for a copy of the CMS log that lists all the pages that were served to your visitors. Load the list into Excel and filter it down to unique URLs.

Once you have the full list, paste it into your crawl configuration:

[Screenshot: ContentIQ crawl configuration for specific pages]
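If you prefer a script to Excel for reducing a served-pages log to unique URLs, the step could be sketched in Python as follows. This assumes one URL per line and assumes that query strings and fragments (such as tracking parameters) should be ignored, which may not hold if your site uses query parameters to serve distinct pages. The function name `unique_urls` and the sample data are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def unique_urls(lines):
    """Deduplicate URLs, ignoring query strings and fragments, keeping first-seen order."""
    seen = set()
    result = []
    for line in lines:
        raw = line.strip()
        if not raw:
            continue  # skip blank lines
        parts = urlsplit(raw)
        # Assumption: drop the query string and fragment so that
        # tracking parameters do not create duplicate entries.
        normalized = urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))
        if normalized not in seen:
            seen.add(normalized)
            result.append(normalized)
    return result


# Hypothetical log extract, used only for illustration.
served = [
    "https://www.example.com/pricing?utm_source=newsletter",
    "https://www.example.com/pricing",
    "https://www.example.com/blog/launch#comments",
    "",
    "https://www.example.com/pricing",
]
url_list = unique_urls(served)
print("\n".join(url_list))
```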

2) Run a website crawl for pages with zero inbound internal links

To identify orphan pages, set up the audit rule to catch pages that don’t have at least one inbound internal link.

While configuring the audit, set up a recurring crawl to catch any new orphan pages in the future. Note that if you are relying on a URL list, you’ll want to get an updated list from your CMS before each crawl.

[Screenshot: ContentIQ audit rule for zero inbound internal links]
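The logic behind a "zero inbound internal links" rule can be illustrated with a short Python sketch. It is not how ContentIQ implements the audit; it only assumes you have the full URL list from step 1 and, for each crawled page, the internal links found on it. The function name `find_orphans` and the sample paths are hypothetical.

```python
def find_orphans(all_urls, outgoing_links):
    """Return the URLs in all_urls that no other page links to.

    outgoing_links maps each page URL to the internal URLs it links to.
    """
    linked = set()
    for source, targets in outgoing_links.items():
        for target in targets:
            # Assumption: a page linking to itself does not count as an inbound link.
            if target != source:
                linked.add(target)
    return [url for url in all_urls if url not in linked]


# Hypothetical site structure, used only for illustration.
all_urls = ["/", "/pricing", "/blog/launch", "/summer-promo"]
outgoing_links = {
    "/": ["/pricing", "/blog/launch"],
    "/pricing": ["/"],
    "/blog/launch": ["/", "/blog/launch"],
    "/summer-promo": ["/"],
}

orphans = find_orphans(all_urls, outgoing_links)
print(orphans)
```

Here `/summer-promo` links out to the home page, but nothing links to it, so it is the only page reported.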

3) Analyze the audit results

Once the audit is complete, log back into ContentIQ and review the results. Identify the orphan pages and determine what purpose each one serves: Is it actively receiving referral, paid, or social campaign traffic? Does it have quality backlinks? Do site visitors reach it frequently via onsite search?

[Screenshot: ContentIQ orphan pages audit results]

Use your web analytics solution to assess traffic sources, visits, page views, and entry and exit behavior. The example below shows a campaign page that helped acquire traffic for a limited time. Once the campaign ended, the page stopped attracting traffic, so it can be removed from the site:

[Screenshot: GA showing traffic to an orphaned page]

4) Resolve any orphan page found

Once you understand what purpose the orphan page serves and how it supports your website and marketing goals, you can determine what step, if any, to take with the page:

  • Link to it from other internal pages if it’s imperative for site visitors to find it via browsing
  • Archive it if it’s no longer needed
  • Leave it as-is if it’s serving a business need that doesn’t require internal linking to the page

5) Rerun the audit periodically to catch new orphan pages

Pages can become orphaned over time, for example when new content is added without any links to it, or when links to pages nested deep in the site structure are accidentally removed. It is therefore important to check the site periodically for new issues.
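If you export the orphan list from each audit run, comparing consecutive runs highlights only the newly orphaned pages. A minimal Python sketch, assuming each run is available as a plain list of URLs; the function name `new_orphans` and the sample paths are hypothetical.

```python
def new_orphans(previous_run, current_run):
    """Return orphan URLs present in the current audit but not in the previous one."""
    previous = set(previous_run)
    return [url for url in current_run if url not in previous]


# Hypothetical audit exports, used only for illustration.
last_audit = ["/summer-promo"]
this_audit = ["/summer-promo", "/blog/unlinked-post"]

newly_orphaned = new_orphans(last_audit, this_audit)
print(newly_orphaned)
```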

As noted earlier, you can configure ContentIQ to periodically rerun your audit by scheduling a crawl:

[Screenshot: ContentIQ crawl scheduling]

That’s the end of our Quick Win recipe for identifying and addressing orphan site pages.