When you create a new website or a new blog for your business, the first thing you care about is whether people are finding it. One of the first ways they’ll find it is through search engines. Typically, you have to wait for Google’s crawler to visit your website and add it to the index. So the question is: how can you see which pages are indexed, and how can you improve the factors that get Google to crawl your site sooner? Here are some basics to help you check whether your content is being indexed, along with some great ways to make sure Googlebot is crawling your website or blog.
Googlebot is the search crawler Google sends out to collect information about articles on the web and add it to Google’s searchable index. Crawling is the process in which Googlebot goes around the web finding new information. Googlebot follows links from one website to another in order to find new things to crawl. Indexing is the process in which the information Googlebot gathered while crawling is processed. Once an article is processed, it is added to the searchable index if it’s determined to be of high quality. While the article is being indexed, Google processes the words in it. Things like title tags and alt attributes are looked at to help Google understand what the content is about.
Googlebot finds new content by looking at places on the web like blogs, pages, and press releases in order to find links. It crawls those web pages and then follows the links to their destinations to find new places to crawl. It also looks at website sitemaps to find a list of subpages to crawl.
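To make the link-following idea concrete, here is a minimal sketch of how a crawler might pull the links out of a page so it knows where to go next. It uses only Python's standard library. The sample HTML, the example.com addresses, and the names `LinkCollector` and `extract_links` are made up for illustration; this is not how Googlebot itself is implemented.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Turn relative links like "/blog/" into absolute URLs.
                self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    """Return every link found in the given HTML, in document order."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return collector.links


# Hypothetical page content, for illustration only.
sample_html = (
    '<p><a href="/blog/new-post/">New post</a> '
    '<a href="https://example.org/">Elsewhere</a></p>'
)
print(extract_links(sample_html, "https://example.com/"))
```

A real crawler would then fetch each of those URLs and repeat the process, which is why a single link from an already-crawled page is enough for new content to be discovered.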
Here are some great ways for your new content to be discovered by Googlebot.
1. Create a sitemap – A sitemap is an XML document on your website that lists the pages of your website. It should be updated frequently so that it tells search engines which new pages have been added, which helps promote regular indexing of new content by crawlers. For example, if your website is built on WordPress, you can install one of numerous plugins to have the sitemap created and updated automatically.
2. Submit the sitemap to Webmaster Tools – After you create your sitemap, the next thing you should do is submit it to Google Webmaster Tools (since renamed Google Search Console). If you don’t have an account already, create a free one and add your website to it. After you add your website, go to the sitemaps option and add a link to your website’s sitemap. This tells Google to crawl your sitemap and the pages listed in it.
3. Create social network profiles – Crawlers get to your website through links on other websites. One way you can get your content discovered is by creating social network profiles and then adding links on those profiles to your content. Examples of profiles are Twitter profiles, Facebook pages, Google+ pages, etc.
4. Create content offsite – Remember, crawlers look at content off-site and follow the links embedded in it. One great method of both getting links and getting Google to index your content is to create offsite content, such as a guest blog post on a website in your niche, and embed a link in it back to your content. Please note, you don’t want to create blackhat content or engage in spammy link building techniques. This is against Google’s guidelines.
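If your site doesn't have a plugin to build the sitemap from step 1 for you, a basic one is simple enough to generate yourself. The sketch below builds a minimal XML sitemap from a list of page URLs using Python's standard library. The function name `build_sitemap` and the example.com URLs are placeholders; the namespace is the one defined by the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(page_urls):
    """Return a minimal XML sitemap listing the given page URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page_url in page_urls:
        url_el = ET.SubElement(urlset, "url")
        ET.SubElement(url_el, "loc").text = page_url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Placeholder URLs, for illustration only.
pages = ["https://example.com/", "https://example.com/blog/first-post/"]
print(build_sitemap(pages))
```

You would save the output as a file such as sitemap.xml at the root of your site and submit that address as described in step 2. The protocol also allows optional fields like a last-modified date per page, which this sketch leaves out.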
Seeing which pages are indexed is actually easy. First, you can look in Google Webmaster Tools, which shows the number of pages indexed. This is the simplest way. Another way to find out how many pages you have indexed is to do the following.
1. Go to google.com
2. Type in site:domain.com
This will show you a list of all the pages indexed by Google. You can then scroll through all the content indexed.
Another cool thing you can do is type the exact URL, in quotation marks, and do a Google search for it. For example, if you wanted to see if this page is indexed by Google, you would type “https://www.seocompany.ca/how-can-i-see-what-pages-are-indexed-in-google/” into Google, and if it’s indexed, Google will show it as a search result.
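If you check a lot of pages this way, it can save time to build the search links ahead of time and just click through them. The sketch below builds both kinds of query described above (the site: search and the exact URL in quotation marks) and turns each into a Google search link. The helper names are made up for this example, and the link format is simply the familiar google.com/search?q= address; the sketch only builds links and does not fetch any results.

```python
from urllib.parse import urlencode


def site_query(domain):
    """Query that lists the pages Google has indexed for a domain."""
    return f"site:{domain}"


def exact_url_query(page_url):
    """Query that searches for one exact URL, wrapped in quotation marks."""
    return f'"{page_url}"'


def google_search_url(query):
    """Build a Google search link for the given query string."""
    return "https://www.google.com/search?" + urlencode({"q": query})


print(google_search_url(site_query("domain.com")))
print(google_search_url(exact_url_query(
    "https://www.seocompany.ca/how-can-i-see-what-pages-are-indexed-in-google/"
)))
```

Open each printed link in your browser to see the results, just as if you had typed the query in by hand.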
Google’s index count is a good indicator of how much content you have indexed. Typically, Google will remove low quality content from its index in order to reduce pollution. The overall number of pages indexed might change and fluctuate from time to time. If you see a 1% to 5% fluctuation in the overall number of pages indexed, that’s pretty normal. Things are in flux, and some of your outdated, lower quality articles may get tossed out of the index. Sometimes the number of indexed pages may fluctuate for no practical reason at all.
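If you note down your index count every so often, a few lines of code can tell you whether a change is worth worrying about. The sketch below computes the percentage change between two counts and labels it using the 1% to 5% range above as normal, with a drop of more than 10% treated as a potential issue, which is the cut-off this article uses. The "worth watching" label for drops between 5% and 10% is an assumption of this example, since the article doesn't say how to treat that range. The function names and counts are made up.

```python
def index_change_percent(previous_count, current_count):
    """Percentage change in indexed pages; negative means a drop."""
    if previous_count <= 0:
        raise ValueError("previous_count must be positive")
    return (current_count - previous_count) / previous_count * 100


def classify_index_change(previous_count, current_count):
    """Label a change in index count using this article's rough thresholds."""
    change = index_change_percent(previous_count, current_count)
    if change < -10:
        return "potential issue"
    if change < -5:
        # Assumption: the article does not cover drops between 5% and 10%.
        return "worth watching"
    return "normal fluctuation"


# Made-up counts, for illustration only.
print(classify_index_change(1000, 970))
print(classify_index_change(1000, 850))
```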
If the number of indexed pages drops drastically, meaning a drop of over 10%, then that’s a potential issue. Here are some reasons why the number of indexed pages might go down:
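1) You blocked or de-indexed them by mistake. This is very common. A Disallow rule in your robots.txt file can stop Googlebot from crawling whole sections of your website, and a noindex setting (usually a robots meta tag on the page) tells Google to drop the page from its index. Either one, applied inadvertently, can cause pages to disappear.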
2) You may be in a Panda penalty, or have a filter applied to some of your content. If you see content being de-indexed, you should look at the quality of the content. That means checking whether it’s thin or low quality. For example, in the past we’ve seen websites get penalized heavily because they had 400-500 pages of content where each article was maybe 300-400 words long. In theory, having so much content sounds fantastic, but when each article is that thin, Google may think you’re trying to game the system. If you have too much content like that, you can trigger a Panda penalty, which results in your entire website getting penalized.
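For the first reason above, a quick check is to test your robots.txt rules against the pages that dropped out. The sketch below uses Python's built-in `urllib.robotparser` to ask whether Googlebot is allowed to crawl a given URL. The robots.txt contents and example.com URLs are made up, and `googlebot_can_fetch` is just a name chosen for this example. Note that this only covers crawl blocking in robots.txt; it does not detect a noindex tag on the page itself.

```python
from urllib.robotparser import RobotFileParser


def googlebot_can_fetch(robots_txt, page_url):
    """Return True if the given robots.txt text lets Googlebot crawl the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch("Googlebot", page_url)


# Made-up robots.txt that accidentally blocks the whole blog section.
sample_robots_txt = """\
User-agent: *
Disallow: /blog/
"""

print(googlebot_can_fetch(sample_robots_txt, "https://example.com/blog/my-post/"))  # False
print(googlebot_can_fetch(sample_robots_txt, "https://example.com/about/"))  # True
```

To check a live site, you would paste in the actual contents of your own robots.txt file in place of the sample text.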