Did you know that Google can read non-HTML files such as PDFs, spreadsheets, and presentations?
And not just read them, by the way… it can index and rank them in the search engine results.
FUN FACT: Google currently has over a billion PDF files indexed!
How does Google treat PDFs in the search results?
PDFs in Google search results
Here’s a quick Q&A about PDF indexing – straight from the horse’s mouth (slightly paraphrased):
Q: Can Google index any type of PDF file?
As long as a PDF is textual (contains text) and is not password-protected or encrypted, Google can read and index it.
Q: How are links treated in PDFs?
Generally, links in PDF files are treated similarly to links in HTML: they can pass PageRank and other indexing signals, and Google may follow them after crawling the PDF file.
It’s currently not possible to “nofollow” links within a PDF document (not that you’d want to).
Here’s a much more recent confirmation of the same from Google’s Gary Illyes, who said in the comments section of this Google+ post that links within PDF documents do pass PageRank.
Q: Can PDF files rank highly in the search results?
Sure! They’ll generally rank similarly to other web pages.
Q: Is it considered duplicate content if I have content in both HTML and PDF?
Whenever possible, Google recommends serving a single copy of your content.
If this isn’t possible, make sure you indicate your preferred version by, for example, including the preferred URL in your Sitemap or by specifying the canonical version in the HTML or in the HTTP headers of the PDF resource.
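For the HTTP-header option, Google documents a `Link` response header with `rel="canonical"`. Here's a minimal sketch of building that header in Python. The function name and the example.com URL are made up for illustration; how you attach the header depends on your own server setup.

```python
def canonical_link_header(preferred_url: str) -> tuple:
    """Build the (name, value) pair for a rel="canonical" HTTP header.

    Hypothetical helper: send this header with the PDF response so it
    points at the version you want Google to treat as preferred.
    """
    if not preferred_url.startswith(("http://", "https://")):
        raise ValueError("canonical URL must be absolute")
    return ("Link", f'<{preferred_url}>; rel="canonical"')


# Example: the PDF is a copy of an HTML article (URL is an assumption).
name, value = canonical_link_header("https://www.example.com/guide.html")
print(f"{name}: {value}")
```

If the PDF itself is your preferred version, pass the PDF's own URL instead.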
Here’s an interesting read on converting PDFs into HTML content for better search engine rankings:
- Link Juice Hack for PDF Files – Dan Petrovic at dejanseo.com.au
SIDE NOTE on duplicate content
A duplicate content penalty does NOT exist. It’s a myth.
Duplicate content is more of a problem when content appears in more than one place (URL) on your site. Google isn’t a big fan of that.
However, if the same content appears on the Internet in more than one place (URL), which happens when we syndicate the same content to different websites, Google will look for the one that it thinks is the original source and rank that URL and not the other copies.
That’s called ‘omitting duplicate results’, which is FAR from penalizing a site for it.
To learn more, read:
Q: How can I influence the title shown in search results for my PDF document?
Google uses two main elements to determine the title shown:
- the title metadata within the file, and
- the anchor text of links pointing to the PDF file.
To give your PDFs the best possible chance to rank, Google recommends paying attention to both.
What is metadata of a (PDF) document?
Document properties, also known as metadata, are details about a file that describe or identify it.
Document properties include details such as title, author name, subject, and keywords that identify the document’s topic or contents.
Here’s a screenshot to give you an idea where to look for document properties in a Word doc – before saving it as a PDF.
Heads-up: this screenshot is from Word for Mac 2015. It might look slightly different in your version of Word.
Soooo… What Are PDFs Good For?
Traffic Generation via PDFs
If I said you could drive hordes of website traffic with PDFs, I’d be lying. So I won’t.
In my opinion, there are two valid ways to drive traffic with PDFs:
- by sharing PDFs on social media (like you would any other piece of content);
- by uploading PDFs to document-sharing sites like SlideShare.
SlideShare is the only site that has ever sent me any actual web traffic from my PDF uploads.
As a matter of fact, SlideShare is my favorite non-conventional way to drive traffic – mostly with slide presentations though, not with PDFs per se.
To learn more about my SlideShare traffic strategy read:
Another realistic way to bring website traffic back to your blog via PDFs is by getting them listed in the search engines.
SEO Traffic via PDFs
Can a PDF rank highly in Google’s search results?
Only recently did I start noticing just how many of them rank for the searches I regularly perform; I just never paid attention to them in the past.
Here’s an example:
As you can see, the very first result is a PDF.
What ranking signals does Google use to rank PDFs?
- Content itself (see the video below).
- Metadata within the file (discussed above).
- Links to the PDF – just like with any piece of content, the more links you build TO your PDF files, the better chance they have to rank. Also, Google uses the anchor text in those links as a signal of what the PDF is about.
Of course, there’s no reason why you can’t host PDFs on your own blog, build links to them, and get them ranked. Even better that way – bringing SEO traffic to your site vs a third-party platform.
That said, there are times when you might not want your PDF files to rank on Google, as would be the case with the exclusive traffic generation report I give away to my new email list subscribers, for obvious reasons.
To ensure your PDF doesn’t get picked up by Google, add an X-Robots-Tag: noindex directive to the HTTP header used to serve the file. (A PDF has no HTML head, so a noindex meta tag isn’t an option.)
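A PDF can't carry an HTML meta tag, so the noindex signal has to travel as an `X-Robots-Tag: noindex` HTTP response header. Here's a rough sketch of the decision logic in Python. The `PRIVATE_PDFS` set, the file path, and the `extra_headers` function are all hypothetical; in practice you'd usually set this header in your web server or CDN configuration.

```python
from pathlib import PurePosixPath

# Hypothetical list of PDFs that should stay out of search results.
PRIVATE_PDFS = {"/downloads/traffic-report.pdf"}


def extra_headers(request_path: str) -> dict:
    """Return the extra response headers to send for a requested path."""
    headers = {}
    if PurePosixPath(request_path).suffix.lower() == ".pdf":
        headers["Content-Type"] = "application/pdf"
        if request_path in PRIVATE_PDFS:
            # Tells crawlers that honor the header not to index this file.
            headers["X-Robots-Tag"] = "noindex"
    return headers


print(extra_headers("/downloads/traffic-report.pdf"))
print(extra_headers("/downloads/public-guide.pdf"))
```

Keep in mind that Google has to be able to crawl the file to see the header, so don't also block the same PDF in robots.txt.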
Link Building via PDFs
In my opinion, link building is the most realistic way to use PDFs to your advantage.
It works pretty much like syndicating any other type of content, but instead of submitting your content as is, you’ll need to convert it to PDF first.
Sounds like a hassle?
Not really, if you know how to do it right (quickly and efficiently).
Just copy your article to a Word doc and save it as a PDF; it’s that easy.
PDF Directories List
As you can imagine, there are plenty of free directories to submit your PDFs to.
However, just because the directories are plentiful, it doesn’t mean you should go on a PDF dumping spree, i.e. submit the same PDF everywhere.
Here’s a good PDF directory round-up to get you started:
PDF over a Web Page?
Under what conditions would Google rank a PDF over a regular HTML web page?
It all comes down to how Google works and what it finds more relevant.
I’ll let Matt Cutts elaborate:
PDF Traffic Marketing Takeaway
And there you have it: one more potential traffic generation method, one more link building method, and one more thing to add to your already mile-long to-do list…
OR… you could look at it another way: repurposing your content as PDFs is a great way to reach a new audience with a different content format!
You should definitely learn more about content repurposing:
So… a PDF traffic generation strategy might actually be worth exploring further, don’t you think? 😉
From Ana with