Orphan pages are website pages that are not linked to from any other page or section of your site. This means a user cannot access the page without knowing the direct URL. Additionally, search engine crawlers can’t reach these pages by following links from another page, which means they are rarely indexed by search engines. In order for crawlers to find your pages, the pages need to be linked from other pages. Think of it like an actual web for a spider to crawl on. If parts of it are broken, the spider will have a difficult time getting from one place to another.
Most importantly, orphan pages represent missed opportunities to acquire and engage customers and can hurt your bounce rate. Fortunately, the lost page traffic, retention, and revenue, along with the damage to your SEO success, can be easily remedied. Here is how you can use BrightEdge to cure your site of orphan pages.
How to find and resolve orphan site pages?
You can follow an easy 5-step process to identify and address any orphan pages on your site:
- Get a full list of your current website pages
- Run a website crawl for pages with zero inbound internal links
- Analyze the audit results
- Resolve any orphan page found
- Rerun the audit periodically to catch new unlinked pages
Let's quickly dive into each of these steps.
1) Get a full list of your current website pages
Pointing your favorite website audit solution at your home page and expecting it to identify orphan site pages won't work because, by definition, orphan pages are not linked to from any other page on your domain, so the crawler will never find them. Instead, you need to specify the full list of site URLs that the crawler should examine. There are a few ways to get the URL list:
Use your sitemap file
The sitemap is a file, typically placed at the root of your domain, that helps search engine bots understand the content of your site and how often you update it, so they can best surface your content on search engine results pages (SERPs). Many Content Management Systems (CMSs) update the sitemap automatically when you add a new page or post, but make sure your sitemap contains the full list of your pages before using this technique.
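As a rough illustration of pulling the URL list out of a sitemap, here is a minimal Python sketch. It assumes a standard sitemaps.org XML file. The function name `urls_from_sitemap` and the `example.com` URLs are placeholders for illustration, not part of any BrightEdge tooling.

```python
import xml.etree.ElementTree as ET

# Namespace used by standard sitemaps.org sitemap files.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def urls_from_sitemap(xml_text: str) -> list[str]:
    """Return every <loc> value found in a sitemap (or sitemap index) document."""
    root = ET.fromstring(xml_text)
    return [
        loc.text.strip()
        for loc in root.iterfind(".//sm:loc", SITEMAP_NS)
        if loc.text
    ]


# Hypothetical sample; in practice you would read the contents of your own sitemap file.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://www.example.com/</loc></url>
  <url><loc>https://www.example.com/pricing</loc></url>
</urlset>"""

for url in urls_from_sitemap(SAMPLE):
    print(url)
```

If your site uses a sitemap index, the same function returns the child sitemap locations, which you would then fetch and parse in turn.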
Download a site URL list
If a sitemap is not an option - for example, if the sitemap does not contain the full page list - then you can generate the list from your CMS. On WordPress, for example, you can install a lightweight plugin, such as List URLs, to export a list of site URLs as a CSV file. You can also ask your IT team to give you a copy of the CMS log that lists all the pages that were served to your visitors. Load the list into Excel and filter on unique URLs. Once you have the full list, paste it into your crawl configuration.
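If you would rather script the "filter on unique URLs" step than do it in Excel, the sketch below shows one way. The normalization rules (lower-casing the host, dropping `#fragments`, stripping a trailing slash) are assumptions. Adjust them if your site treats `/page` and `/page/` as different URLs. The function names are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize(url: str) -> str:
    """Apply assumed normalization: lower-case host, no fragment, no trailing slash."""
    parts = urlsplit(url.strip())
    path = parts.path.rstrip("/") or "/"
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(), path, parts.query, ""))


def unique_urls(raw_urls):
    """Return normalized URLs in first-seen order, skipping blanks and duplicates."""
    seen = set()
    result = []
    for raw in raw_urls:
        if not raw.strip():
            continue  # ignore empty rows from the export
        url = normalize(raw)
        if url not in seen:
            seen.add(url)
            result.append(url)
    return result


# Hypothetical rows, as they might appear in a CMS export or server log.
exported = [
    "https://Example.com/a/",
    "https://example.com/a",
    "",
    "https://example.com/a#top",
    "https://example.com/b?x=1",
]
print(unique_urls(exported))
```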
2) Run a website crawl for pages with zero inbound internal links
To identify orphan pages, set up the audit rule to catch pages that don't have at least one inbound internal link. While configuring the audit, set up a recurring crawl to catch any new unlinked pages in the future. Note that if you are relying on a URL list, you'll want to get an updated list from your CMS before each crawl.
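Conceptually, the "zero inbound internal links" rule is a set difference: every known URL minus every URL that some other page links to. The sketch below illustrates that idea only and is not how any particular audit tool implements it. The `links` mapping, the `entry_points` parameter, and the sample URLs are all hypothetical.

```python
def find_orphans(all_urls, links, entry_points=()):
    """Return URLs from all_urls that no other page links to.

    links: mapping of source URL -> iterable of internal URLs found on that page.
    entry_points: URLs (such as the home page) to exclude from the result.
    """
    linked = set()
    for source, targets in links.items():
        for target in targets:
            if target != source:  # a page linking to itself is still unreachable
                linked.add(target)
    excluded = linked | set(entry_points)
    return sorted(url for url in set(all_urls) if url not in excluded)


# Hypothetical crawl data: the full URL list plus the links found on each page.
all_urls = [
    "https://www.example.com/",
    "https://www.example.com/pricing",
    "https://www.example.com/old-campaign",
]
links = {
    "https://www.example.com/": ["https://www.example.com/pricing"],
    "https://www.example.com/pricing": ["https://www.example.com/"],
    "https://www.example.com/old-campaign": ["https://www.example.com/old-campaign"],
}
print(find_orphans(all_urls, links))
```

This is also why the full URL list matters: a page missing from `all_urls` can never appear in the result, no matter how the links are analyzed.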
3) Analyze the audit results
Once the audit is complete, log back into ContentIQ and view the results of your audit. Identify the orphan pages and determine their objectives: Are they actively driving referral, paid or social campaign traffic? Do they have quality backlinks? Do site visitors access them extensively via onsite search?
Use your web analytics solution to assess traffic sources, visits and page views, and entry and exit behaviors. A typical case is a campaign page that helped acquire traffic for a given time. Once the campaign ended, the page no longer attracted traffic, so it can be removed from the site.
4) Resolve any orphan page found
Once you understand what purpose the orphan page serves and how it helps drive your website and marketing goals, you can determine what step, if any, to take with the page:
- Link to it from other internal pages if it's imperative for site visitors to find it via browsing
- Archive it if it's no longer needed
- Leave it as-is if it's serving a business need that doesn't require internal linking to the page
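If you have many orphan pages to review, the three options above can be roughed out as a simple triage rule. The sketch below is only one possible reading of the questions from step 3. The field names and the "any activity means keep" logic are assumptions, and the final call on each page remains a business decision.

```python
from dataclasses import dataclass


@dataclass
class OrphanStats:
    """Hypothetical per-page figures gathered from your analytics and audit tools."""
    url: str
    campaign_visits: int        # referral, paid, or social campaign traffic
    quality_backlinks: int
    onsite_search_visits: int
    needed_for_browsing: bool   # should visitors be able to reach it by browsing?


def suggest_action(stats: OrphanStats) -> str:
    """Map a page's figures to one of: 'link', 'leave', 'archive'."""
    if stats.needed_for_browsing:
        return "link"      # add internal links from relevant pages
    if stats.campaign_visits or stats.quality_backlinks or stats.onsite_search_visits:
        return "leave"     # still serves a need without internal links
    return "archive"       # no longer needed


# Hypothetical example: a finished campaign page with no remaining activity.
page = OrphanStats("https://www.example.com/old-campaign", 0, 0, 0, False)
print(page.url, "->", suggest_action(page))
```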
5) Rerun the audit periodically to catch new orphan pages
Since pages can become orphaned over time - by adding new content and forgetting to link to it, or by accidentally removing links to pages nested deep in the site structure - it is important to check the site periodically for new issues. As noted earlier, you can configure ContentIQ to periodically rerun your audit by scheduling a crawl.
That's the end of our Quick Win recipe for identifying and addressing orphan pages on your site.