The way search engines index sites hasn’t changed substantially over the last three decades.
While it is important for web admins to consistently update their processes in line with best practices if they wish to maintain fast, comprehensive indexing, the underlying search engine technology has remained largely the same at a fundamental level.
But that may be about to change.
IndexNow, a protocol that allows for direct communication with search engines, has the potential to dramatically alter the search landscape.
Having control over your website is empowering, and taking a firmer grasp of how your content is indexed will lead to broader and faster visibility in search engines.
This post explains some of the latest developments in indexing and how fostering greater autonomy over content crawling is a net win for SEOs, webmasters, search engines and, ultimately, users. We’ll also outline the current indexing best practices.
What Role Will IndexNow Play in the Future of Search?
IndexNow is a hot topic in the SEO space. If you haven’t already encountered it, it is essentially a notification service that web admins use to alert search engines when they publish content on their sites (or delete old content).
Because the technology holds the potential to save significant amounts of resources for search engines (by reducing the need for ongoing “exploratory” crawling), many SEO experts consider it extremely likely that it will be universally adopted.
It also presents an attractive proposition to SEOs because it allows them to let search engines know about changes to their sites instantly. It can sometimes take weeks or months for new pages to be discovered with traditional crawling.
Microsoft and Yandex are leading the way when it comes to giving webmasters control over how and when their sites are crawled. IndexNow accounts for 7% of new sites indexed in Bing. Seznam.cz also adopted IndexNow in early 2022. Over 16 million websites use the protocol, and it is offered through major WordPress plugins like Yoast. Cloudflare and Duda have also integrated it into their services.
However, IndexNow will likely only become truly significant if Google chooses to support it. Google has confirmed that it is currently testing the protocol. Until Google commits to it, IndexNow will not be the kind of earth-shifting technology it could be. In our opinion, at least.
How to Use IndexNow
There are no downsides to using IndexNow on your site. The benefits are greatest if you run a smaller site that search engines are slow to index, but big brands like LinkedIn and eBay also use and benefit from the protocol.
Follow the steps below to start using IndexNow:
- Create and host an API key: You need to generate an API key to verify ownership of your site and any submitted URLs. You can use Bing’s API key generation tool to do this. You must then host a UTF-8 encoded .txt file containing the key at your website's root directory.
- Submit URLs: Once you have set up verification for your site, you can submit an HTTP request to the following URL: https://<searchengine>/indexnow?url=url-changed&key=your-key
- Check submission success: Verify the success of submitted URLs using Bing Webmaster Tools or your tracking software.
IndexNow has published extensive documentation. Alternatively, you can use a WordPress plugin or content delivery network (CDN) that supports IndexNow.
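The steps above can be sketched in a few lines of Python. This is a minimal illustration rather than an official client: the function names are our own, the host `www.bing.com` and the key shown in the usage note are placeholders, and the assumption that the key file is named after the key itself should be checked against the IndexNow documentation before you rely on it.

```python
from urllib.error import HTTPError
from urllib.parse import urlencode
from urllib.request import urlopen


def key_file_location(site_host: str, key: str) -> str:
    """Return the URL where the key file is assumed to live: <key>.txt at the site root."""
    return f"https://{site_host}/{key}.txt"


def build_ping_url(search_engine_host: str, changed_url: str, key: str) -> str:
    """Build https://<searchengine>/indexnow?url=url-changed&key=your-key."""
    # urlencode percent-encodes the changed URL so it survives as a query value.
    query = urlencode({"url": changed_url, "key": key})
    return f"https://{search_engine_host}/indexnow?{query}"


def submit(ping_url: str, timeout: float = 10.0) -> int:
    """Send the notification and return the HTTP status code."""
    try:
        with urlopen(ping_url, timeout=timeout) as response:
            return response.status
    except HTTPError as error:
        # urlopen raises on 4xx/5xx responses; surface the code instead.
        return error.code
```

For example, `build_ping_url("www.bing.com", "https://www.example.com/new-page", "0123456789abcdef")` produces the request URL for a single new page, and passing that URL to `submit` sends it. A 2xx status generally means the request was received, but you should still confirm the outcome in Bing Webmaster Tools as described above.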
Other Key Indexing Trends
The growing presence of IndexNow isn’t the only indexing trend worth talking about. Here are three more that SEOs should take note of:
- Programmatic SEO and AI-generated content: The rise of AI-generated content and programmatic SEO (the auto-creation of huge quantities of pages targeting individual or small sets of keywords) presents a crawling problem for search engines. If there are millions of auto-generated pages, search engines will be unable to crawl the web effectively. This perhaps explains why Google’s latest update penalizes AI-generated content in favor of content written for humans.
- The continued presence of crawl bots: Crawl bots are not going anywhere. They will remain key to the way search engines index websites. Googlebot can now crawl sites over HTTP/2, a transfer protocol that uses fewer resources than HTTP/1.1. It’s critical to understand how robots traverse your website to gather data. With this understanding, you can structure your site accordingly to streamline indexing.
- Wider availability of tools for checking indexing status: It’s becoming easier for SEOs to check the indexing status of their web pages and digital assets. One key example is the recently launched video indexing report in Google Search Console. This trend will likely continue as SEO software becomes more granular and advanced.
Indexing Best Practices Checklist
The process that search engines typically use to index content is relatively straightforward. After a web page is published, search engines crawl it and, assuming no policies are broken, add it to their index.
Here is a quick-fire, comprehensive checklist of indexing best practices:
- Set up IndexNow pings: There are various ways of alerting search engines about new and updated content via IndexNow. Consult the published documentation for an overview of these.
- Ensure your site is mobile friendly: Google operates a mobile-first indexing system, meaning that it primarily uses the mobile version of a site when crawling and indexing.
- Noindex appropriate pages: If you have pages that you don’t wish to appear in search results, use the “noindex” tag. You are less likely to encounter errors in indexing reports (such as those in Search Console) if you noindex pages correctly.
- Remove misplaced noindex tags: You should run regular checks to identify pages on which noindex tags have been incorrectly placed.
- Monitor your pages’ index status with Search Console: Check that your pages are indexed in Google with Search Console. Google likely provides the bulk of your traffic.
- Submit sitemaps to major search engines using applicable webmaster tools: Create and submit sitemaps to search engines, especially if you have a large site.
- Create a robust internal linking architecture: Link internally to enable effective crawling. Using a logical, category-based, human-intelligible URL structure is also good practice.
- Eliminate duplicate content: Duplicate content slows crawling. If necessary, use software to run checks and remove unnecessary content. Use rel=canonical tags if you need to publish duplicate content.
- Account for stand-alone web pages: It’s important to link to stand-alone web pages, especially landing pages, that aren’t accessible from other areas of your site.
- Add a robots.txt file if necessary: A robots.txt file tells search engine crawlers which pages they may crawl. It is used when you want pages to remain indexed while restricting crawler activity that might overload your servers.
- Ensure metadata is correct: The addition of accurate metadata to web pages makes crawling more efficient. This includes meta titles and descriptions, robots, googlebot, SafeSearch and rel attributes.
- Optimize multi-language sites: Use location-based URLs or hreflang tags to specify which pages apply to different locations.
- Use Google crawl rate settings (if necessary): This won’t be an issue for the vast majority of web admins, but Google does afford web admins some degree of control over how it crawls their websites. Emergency crawl limits can be set in certain circumstances.
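Two of the checks in this list, spotting misplaced noindex tags and confirming what your robots.txt file allows, are easy to script. The sketch below uses only the Python standard library. The function and class names are our own, and it only inspects meta tags in the page HTML, so it will not catch a noindex directive sent in an HTTP response header.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots"> and <meta name="googlebot"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = {name: (value or "") for name, value in attrs}
        if attributes.get("name", "").lower() in {"robots", "googlebot"}:
            # The content attribute is a comma-separated list, e.g. "noindex, follow".
            for directive in attributes.get("content", "").split(","):
                self.directives.add(directive.strip().lower())


def has_noindex(html: str) -> bool:
    """Return True if the page HTML carries a noindex meta directive."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


def is_crawlable(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Return True if the given robots.txt content lets user_agent fetch url."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    return rules.can_fetch(user_agent, url)
```

Run `has_noindex` over the HTML of pages you expect to rank and flag any that return True. Run `is_crawlable` with the text of your robots.txt file against the same URLs to confirm that crawlers are not blocked from them.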
Using BrightEdge for Better Indexing
Indexing can be a real issue if you have large quantities of errors on your site, as they prevent search engines from crawling it and limit your organic performance. BrightEdge ContentIQ takes identifying technical errors to the next level and has the power to support site crawls for millions of pages. It enables marketers to catch, prioritize, and fix website errors and issues at scale before they impact organic performance. The Recommendations engine in the platform alerts you to these prioritized issues and provides clear steps for resolving them.
Effective site indexing is a core part of search engine optimization, particularly for larger organizations for whom issues are more likely to arise.
Staying on top of changes and consistently updating your process according to best practices helps you ensure that all pages on your site are indexed and that you can effectively troubleshoot issues.
SEOs should also stay abreast of the evolution and growth of IndexNow. It is likely that Google will either adopt IndexNow or develop its own technology, in which case it will benefit you to adopt new protocols as early as possible.