Building an E-Commerce Website: Top 7 SEO Mistakes


(Newswire.net — October 5, 2022) — E-commerce platforms today like launchcart.com are incredibly powerful. But things can go very wrong if you don’t know what you’re doing in terms of SEO. Just because you (or your web design firm) know how to make pages look right for users and work properly as a shopping cart doesn’t mean you’re doing the right things to make your website rank well in Google and bring you lots of customers.

Every e-commerce platform makes a tradeoff between locking things down so you can’t break them and giving you the flexibility to customize your pages and navigation the way you need.

Just because it doesn’t look broken to you doesn’t mean that Google is seeing what you hope they’re seeing.

So, without further ado, here are the top 7 SEO mistakes seen by SEO consultants:

1. Faceted Search and Rel Canonicals

Faceted search is incredibly helpful to the user. It’s the ability to take a set of products and filter them by several characteristics. Take a clothing website, for instance. Perhaps it has a product category page for kids’ sweaters, and in the sidebar menu, you can filter by up to 10 brands, 10 colors, 10 sizes, 5 price ranges, and 3 materials. That gives you 15,000 possible variations on that page!

This is all great for users—but if you’re not careful, you might be presenting Google with 15,000 different web pages to consider indexing…when they’re really just variants of a single page.

How it goes wrong: If those filters are all parameters on the URL, you could be in trouble. So, let’s say your e-commerce platform implements a page like that with a URL like:

https://example.com/kids/sweaters?brands=117-152-4-63&color=blue&size=11&price-low=20&price-high=24&material=cotton

You need to tell Google that the URL above is really the same page as https://example.com/kids/sweaters, and you do that by setting a rel="canonical" link in the page’s <head> that points to https://example.com/kids/sweaters.
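To make that concrete, here’s a minimal sketch in TypeScript of what that fix can look like, assuming you control the template that generates the page’s <head>. The filter parameter names come from the example URL above; the helper names (canonicalFor, canonicalLinkTag) are purely illustrative, not something your platform ships with.

  // Minimal sketch: derive the canonical URL for a faceted category page by
  // stripping the filter parameters, then emit a <link rel="canonical"> tag.
  // The parameter names match the example URL above; adapt them to your platform.
  const FILTER_PARAMS = ["brands", "color", "size", "price-low", "price-high", "material"];

  function canonicalFor(pageUrl: string): string {
    const url = new URL(pageUrl);
    for (const param of FILTER_PARAMS) {
      url.searchParams.delete(param);
    }
    // With the filter parameters removed, toString() drops the "?" if nothing is left.
    return url.toString();
  }

  function canonicalLinkTag(pageUrl: string): string {
    return `<link rel="canonical" href="${canonicalFor(pageUrl)}">`;
  }

  // Example:
  // canonicalLinkTag("https://example.com/kids/sweaters?color=blue&size=11")
  //   -> '<link rel="canonical" href="https://example.com/kids/sweaters">'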

According to SEO consultant Michael Cottam, some people have “solved” this by setting meta robots to “noindex” on all versions of the page that have parameters. The problem with this approach is that you’re likely to get links to those variants from blogs, social media, etc., and by noindexing those variants, you’re throwing that link juice away.

If you’re filtering using client-side JavaScript and not changing the URL as those filters change, you’re in good shape.

2. Overly Restrictive Robots.txt

Many people have tried to use robots.txt to control indexation. That’s not what it’s for. In fact, Google will often index pages despite them being blocked in robots.txt. There’s even a status in Search Console that calls this out: “Indexed, though blocked by robots.txt.”

Robots.txt should only be used to manage your crawl bandwidth. Let’s say your pages each have 3 extra links on them: one for “share this,” one for “add to cart,” and one for “add to wishlist,” and each of those links takes a parameter that indicates the page being shared/added.

So your e-commerce site with 100,000 product pages also has 300,000 other URLs that Google can see. Normally, you’d set meta robots “noindex” on those ancillary pages, and each time Google saw one of those links, it’d fetch that page, see the noindex, and move on.

But if you’re having issues with Google crawling your new and updated content often enough, you might block those URLs in robots.txt and cut the number of pages Google tries to fetch down to a quarter.

It’s also common to use robots.txt to block entire folders, like wp-content/plugins, on the theory that there are no pages in there Google should waste time on. The problem is that those plugins probably have CSS stylesheets in that folder hierarchy, so if you block it in robots.txt, Googlebot can’t render whatever those plugins are creating on your pages.
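Here’s a minimal sketch of what such a robots.txt could look like, generated by a small Node/TypeScript script and served from your site root. The paths for the share/add-to-cart/add-to-wishlist links (/share, /cart/add, /wishlist/add) are hypothetical placeholders; substitute whatever URL patterns your platform actually uses, and test the rules before you deploy.

  // Minimal sketch: write a robots.txt that trims crawl waste without blocking
  // the assets Googlebot needs for rendering. The ancillary paths below are
  // hypothetical; use the patterns your platform actually generates.
  import { writeFileSync } from "node:fs";

  const robotsTxt = [
    "User-agent: *",
    // Keep crawlers off the 300,000 share/add-to-cart/add-to-wishlist URLs
    // so crawl budget goes to the 100,000 real product pages.
    "Disallow: /share",
    "Disallow: /cart/add",
    "Disallow: /wishlist/add",
    // Don't block wp-content/plugins outright (see above); if you ever do block
    // a plugin folder, pair the Disallow with Allow rules like these so the
    // CSS and JS inside it stay crawlable and pages still render:
    "Allow: /wp-content/plugins/*.css",
    "Allow: /wp-content/plugins/*.js",
    "",
    "Sitemap: https://example.com/sitemap.xml",
  ].join("\n");

  writeFileSync("robots.txt", robotsTxt + "\n");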

3. Getting Too Fancy with Your JS Menus

JavaScript can be a terrific way to implement powerful, context-sensitive, and fast menus.  But your main menu is also how most of your key search landing pages get internal link juice.

When Googlebot initially downloads the HTML for a webpage, you want Googlebot to see the links to all of the pages in your main menu, so that they pass link juice to those key pages.

For example, say you recently revamped your site and moved to a full JS menu that’s created client-side, after the page finishes loading, to improve page load time metrics and Core Web Vitals scores.

The problem is that Googlebot isn’t guaranteed to run that late-executing JavaScript when it processes the page, so it may never see those menu links. As a result, PageRank isn’t passed to those key landing pages, and you could lose rankings and traffic.
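One way to keep both the speed win and the crawlable links is progressive enhancement: ship the menu’s plain <a> links in the initial HTML and let client-side script only add the fancy behavior on top. Here’s a rough sketch, with a hypothetical menu structure (the id and class names are made up for illustration):

  // Progressive-enhancement sketch: the <nav> markup with plain <a> links is
  // rendered in the initial HTML (server-side or static), so Googlebot sees the
  // internal links without running any JavaScript. The script below only adds
  // behavior on top of markup that already exists.
  //
  // Initial HTML (simplified, hypothetical structure):
  //   <nav id="main-menu">
  //     <a href="/kids/sweaters">Kids' Sweaters</a>
  //     <a href="/kids/jackets">Kids' Jackets</a>
  //   </nav>

  document.addEventListener("DOMContentLoaded", () => {
    const menu = document.getElementById("main-menu");
    if (!menu) return;

    // Enhance, don't create: add a toggle button for small screens,
    // but leave the crawlable <a> elements in place.
    const toggle = document.createElement("button");
    toggle.textContent = "Menu";
    toggle.addEventListener("click", () => {
      menu.classList.toggle("is-open");
    });
    menu.before(toggle);
  });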

4. Content-Obscuring Popups

Most people hate those annoying popups that scream at you to sign up for their newsletter as soon as you land on the site.  But somebody in that company’s marketing department gets measured on how much their newsletter signups are growing, so they’re fighting to have that popup there, on the first page the user lands on.

But when those popups obscure a large part of the page’s content, Google sees that as intrusive, and a bad user experience—and can sometimes interpret what’s in the popup as the main content of the page.

Sure, for users, that only happens on the first page they visit, but for Googlebot, every page it crawls is the first visit, so Googlebot sees the popup on every page.

5. Improperly Sizing/Compressing Images

Often, big images are one of the main reasons a page loads slowly. And because Page Experience (mobile usability + Core Web Vitals) is now a ranking factor, this can be a big deal.

Three problems I often see in this area:

  • Image dimensions are too large – say the image is 2000×3000 pixels, but your page only needs it at 500×750 pixels. The browser can resize it for you (just specify the width and height attributes), but if you rely on that, you’re downloading 16x more image data than you need. Instead, resize the image to 500×750 and save it on your web server at that size.
  • The image isn’t compressed – not all JPGs are created equal. Using image compression in something like Paintshop Pro or Resize Image, you can generally make a nice, sharp, compressed version of an image that’s 1000 pixels wide and well under 200KB. Without that compression, the same image might be 1-2MB (see the sketch after this list).
  • Using PNG when JPG will do – JPG images typically compress to around 5x smaller than the equivalent PNG, so use JPG if you can. About the only time you need to resort to a PNG is when you need background transparency.
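If you’d rather script the resize-and-compress step than do it by hand, here’s a minimal sketch using the sharp image library for Node (that library choice is my assumption; any tool that can resize and set JPEG quality does the same job):

  // Minimal sketch: resize a large source image down to the dimensions the page
  // actually needs and save a compressed JPEG. Requires: npm install sharp
  import sharp from "sharp";

  async function resizeAndCompress(input: string, output: string): Promise<void> {
    await sharp(input)
      .resize(500, 750)        // the display size from the example above
      .jpeg({ quality: 75 })   // quality is a tradeoff; inspect the result and adjust
      .toFile(output);
  }

  // Example: turn a 2000x3000 original into the 500x750 version the page uses.
  resizeAndCompress("product-original.jpg", "product-500x750.jpg")
    .catch((err) => console.error(err));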

If you’re running a WordPress site, the Imagify plugin is great.

6. ADA Compliance and ALT Text

ALT text on your images is required for ADA compliance. That ALT text lets people with vision impairments use screen readers that tell them audibly what’s on the screen. Having that ALT text is good for the relatively small percentage of your site visitors who have that sort of impairment, but missing ALT text is also very easy for a plaintiff’s attorney to spot (e.g., using a crawler like Screaming Frog SEO Spider), and if you’re targeted, you could find yourself paying out a $10K settlement or penalty. See this Search Engine Land article on the cost of non-compliance.
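A quick self-audit you can run yourself (a rough sketch, not a substitute for a proper accessibility review): fetch a page and flag every image with missing or empty ALT text. This example assumes the cheerio HTML parser is installed and that Node 18+ provides fetch; the URL is the example domain from earlier.

  // Minimal sketch: fetch a page and list <img> tags that have no ALT text.
  // Requires: npm install cheerio
  import * as cheerio from "cheerio";

  async function findImagesMissingAlt(pageUrl: string): Promise<string[]> {
    const html = await (await fetch(pageUrl)).text();
    const $ = cheerio.load(html);
    const missing: string[] = [];

    $("img").each((_, el) => {
      const alt = $(el).attr("alt");
      // Note: alt="" is legitimate for purely decorative images, so review
      // flagged items rather than blindly adding text to all of them.
      if (alt === undefined || alt.trim() === "") {
        missing.push($(el).attr("src") ?? "(no src)");
      }
    });

    return missing;
  }

  // Example:
  findImagesMissingAlt("https://example.com/kids/sweaters")
    .then((srcs) => console.log("Images missing ALT text:", srcs))
    .catch((err) => console.error(err));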

7. Improperly Hiding Your Staging Site

A good web development agency or team will have a separate staging website, where people can see and test a new version of the site before it moves to your live, production server. But there are good, bad, and terrible ways to do this. If Googlebot is able to discover and crawl that staging server’s webpages, you’ve got a duplicate content problem. And once you do finally move those pages to production, guess which version Google might think is the original?

  • VERY GOOD: password-protect every page, so that Googlebot cannot see the content, links, etc. (see the sketch after this list).
  • GOOD: set meta robots “noindex” on every page on the staging server. Googlebot can still see the pages and their content but won’t index those pages. About the only problem here is that if you’re doing a crawl test on staging, you don’t have a way to detect which pages are going to have meta robots noindex once they’re in production.
  • BAD: block Googlebot in robots.txt. Why is that a problem?  See the section on Overly Restrictive Robots.txt above.
  • VERY BAD: don’t block Googlebot in any way—just hope Google never discovers them!
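For the password-protection option (the “VERY GOOD” approach above), here’s a minimal sketch of HTTP Basic Auth as Express middleware, assuming a Node-based staging server; the credentials and port are placeholders. If your staging site runs behind nginx, Apache, or a hosting control panel instead, the equivalent is a basic-auth rule at the web-server level.

  // Minimal sketch: HTTP Basic Auth in front of every page on a staging server,
  // so Googlebot (and everyone else without the password) never sees the content.
  // Requires: npm install express. Credentials and port are placeholders.
  import express from "express";

  const app = express();

  const USER = process.env.STAGING_USER ?? "staging";
  const PASS = process.env.STAGING_PASS ?? "change-me";

  app.use((req, res, next) => {
    const header = req.headers.authorization ?? "";
    const [scheme, encoded] = header.split(" ");
    const expected = Buffer.from(`${USER}:${PASS}`).toString("base64");

    if (scheme === "Basic" && encoded === expected) {
      // Belt and suspenders: even behind auth, tell crawlers not to index staging
      // (the "GOOD" option above, layered on top of the password).
      res.set("X-Robots-Tag", "noindex, nofollow");
      return next();
    }

    res.set("WWW-Authenticate", 'Basic realm="Staging"');
    res.status(401).send("Authentication required");
  });

  // ...the rest of the staging site's routes go here...
  app.listen(3000);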