Has your original web content been mysteriously appearing on other sites? Has something you’ve written recently appeared word-for-word, without attribution or back links, on a web site other than your own?
Web content theft, whether done manually or by automated tools known as scrapers, has been a rampant practice on the Internet for years. The bad news is that there is no sure-fire way of putting a stop to it. Once content is viewable on the Internet, there is always a way to obtain it. But according to Scott Wilson, search engine optimisation (SEO) specialist and CEO of Burlington, Ontario-based Rankhigher.ca, there are a number of strategies you can employ to make sure your content maintains the upper hand, search ranking-wise, over the copied content.
Wilson, whose specialisations include video production, optimisation of Google Places and search marketing in Google AdWords, said “organisations bent on stealing content will always find a way to do so.”
The odds are heavily stacked against the original owner of the content because apart from competitors, there are tens of thousands of sites on the Internet that exist to steal content.
“The stealing is done to populate MFA (made-for-ads) Web sites. These are sites that make money by serving up online ads,” he said.
“The MFA site owners earn money from the clicks the ads get. In order to attract viewers, the site owners fill the site with low-quality content that rides on popular topics, or with content that is stolen from other sites either through manual copy-and-paste methods or automated scrapers,” Wilson explained.
How to uncover content snatchers
There are offices that handle cases of content theft, and legal action can be brought against content snatchers.
There are numerous cases of organisations going after content thieves or online plagiarists, but these are typically large corporations that are out to protect their multimillion-dollar brands or intellectual property. “Very often,” Wilson said, “SMEs are not prepared to undertake what could be a protracted legal battle.”
“A small business operator has to determine if it is worth the time, effort and money to track down the offender and bring them to court,” Wilson said.
On the part of its site dealing with the Digital Millennium Copyright Act (DMCA), Google said that when it receives notice of alleged copyright infringement, the search engine’s actions could include “removing or disabling access to material claimed to be the subject of infringing activity and/or terminating subscribers.”
If Google does remove or disable access in response to such a notice, Google makes “a good-faith attempt” to contact the owner or administrator of the affected site or content so that they may make a counter notification.
Google also warned that parties who file a complaint “will be liable for damages (including costs and lawyers’ fees)” if they materially misrepresent that a product or activity is infringing their copyrights.
Wilson also suggested that businesses can tackle the problem by concentrating on strengthening their original content’s search optimisation properties.
“The first step,” Wilson said, “is to determine if your content is being stolen.” You can do a search on Google and other search engines using the keywords that you think best describe your text or image content. However, a faster way is to use online tools.
Wilson recommends using Copyscape, a free tool that helps users identify who is publishing and republishing their Web site content.
“We frequently hire freelance writers and we use Copyscape to make sure that they are not using duplicate or plagiarised content. Copyscape is the best site I know for this purpose. It is used by many Webmasters, businesses and social media experts,” he said.
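As a rough illustration of what such a check involves, two texts can be compared by the share of word sequences they have in common. The Python sketch below is only that, a sketch: it is not how Wilson's recommended tool works internally, the function names are hypothetical, and it assumes you have already saved the text of your own page and of the suspected copy.

```python
import re


def shingles(text, size=5):
    """Return the set of overlapping word sequences ("shingles") in text."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < size:
        # Text shorter than one shingle: treat the whole text as a single unit.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def copied_fraction(original, suspect, size=5):
    """Fraction of the original's shingles that also appear in the suspect text."""
    original_shingles = shingles(original, size)
    if not original_shingles:
        return 0.0
    shared = original_shingles & shingles(suspect, size)
    return len(shared) / len(original_shingles)
```

A result close to 1.0 suggests the suspect page reproduces most of the original word for word, while a result near 0.0 suggests little verbatim overlap. The shingle size of five words is an arbitrary assumption; smaller sizes flag more incidental matches.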
Make sure Google knows you’re the original author, as Google’s search algorithms are geared towards rewarding original authors and creators of high-value content with higher search rankings.
Taking this into account, Wilson said, “A business owner can protect their site’s content by making sure the search engine identifies your site as the original source of the content.”
Site owners should keep three things in mind: first, making sure Google can find your site; second, making sure Google trusts your site; and finally, making sure your content is focused.
According to Wilson, “Google web crawlers are not very good at identifying Flash content. Unfortunately, many web designers use Flash when they want to create special features and effects for a site.”
“Don’t get me wrong. I like Flash and I think it’s a great tool. But if you want your site to be easily found by Google, you can also use open-source tools such as WordPress, Joomla! or Drupal, which Google can easily read,” he said.
“When there are several sites that appear to have the same content, Google typically gives the site where the content appeared earlier a higher ranking because this is probably the original source,” said Wilson.
“One way sites can establish content seniority,” he said, “is to avoid changing the URL of a site’s content. The URL has an association indicating when the content was posted online. Any change to that URL will put that date back to zero.”
“If URLs need to be altered, a 301 redirect should be added to it to ensure that searchers are redirected to the original URL. This way the content isn’t viewed as new,” said Wilson.
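A 301 is the HTTP "Moved Permanently" status: a request for the retired address is answered with the location of its replacement, so browsers and crawlers can follow the move. Most sites would configure this in their web server or content management system. Purely as an illustration, the Python sketch below uses the standard library's http.server module; the paths in MOVED, the resolve helper and the port number are hypothetical.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical mapping of retired paths to their replacements.
MOVED = {"/old-article.html": "/articles/content-theft/"}


def resolve(path):
    """Return (status, location) for a requested path."""
    if path in MOVED:
        return 301, MOVED[path]
    return 200, None


class RedirectHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, location = resolve(self.path)
        self.send_response(status)
        if location:
            # Tell the client where the content now lives.
            self.send_header("Location", location)
            self.end_headers()
            return
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.end_headers()
        self.wfile.write(b"Page content would be served here.\n")


def serve(port=8000):
    """Run the demo server on localhost until interrupted."""
    HTTPServer(("127.0.0.1", port), RedirectHandler).serve_forever()
```

Calling serve() and requesting /old-article.html should return a 301 response whose Location header points at the new path. The sketch matches paths exactly and ignores query strings, which a production setup would need to handle.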
Some business owners might also want to rethink the situation, according to Wilson. “Having your content appear on another site might not be a totally bad thing.” If the content is properly attributed to you or your site, it could turn out to be a positive, since it means additional exposure that you do not need to pay for.
Wilson said, “Make sure the copied content mentions your name or your business name as the source. Also have the site featuring your content link back to your site. This way you could get some of its readers to come back to your site.” Links could be established directly, by having the other site post your name and site address with a link to it, or indirectly, via hyperlinks on the text or image being used.
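Whether a republishing page actually links back can be checked mechanically. The following Python sketch, which assumes you have already downloaded the page's HTML, uses the standard library's html.parser to look for a hyperlink to your domain; the class and function names, and the example.com domain, are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def links_back(html, domain):
    """Return True if the page links to domain or one of its subdomains."""
    collector = LinkCollector()
    collector.feed(html)
    for href in collector.hrefs:
        host = urlparse(href).hostname or ""
        if host == domain or host.endswith("." + domain):
            return True
    return False
```

For example, links_back(page_html, "example.com") is True for a page that contains a link to https://www.example.com/post and False for a page with no links to that domain. The check only confirms that a link exists, not that the attribution text names you as the source.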