Prevent PDFs and other media from being indexed by search engines

Hello,

My client adds PDFs using the link field. In my code, I then link to those PDFs and open them in a new tab.

Everything works fine, but the problem is that Google indexes those PDFs and shows them in the search results.

Is there an option in Prismic to prevent those PDFs from being indexed? (I am using Next.js, by the way.)

Here is an example of PDFs being indexed by Google.

https://www.google.com/search?client=opera&sxsrf=ALeKk02L1dgn1GXhi8CTVisj-Xd3Tt8IcA%3A1606728342288&ei=lrrEX8yfEcubgQb2jKHIAw&q=hivery+pdf+download&oq=hivery+pdf+download&gs_lcp=CgZwc3ktYWIQAzIFCCEQoAE6BwgjELADECc6BwgAEEcQsAM6CAghEBYQHRAeOgkIABDJAxAWEB46BggAEBYQHlDwIlixKWDAKmgAcAB4AIABiAGIAd0GkgEDOC4ymAEAoAEBqgEHZ3dzLXdpesgBCcABAQ&sclient=psy-ab&ved=0ahUKEwjMwt7A-antAhXLTcAKHXZGCDkQ4dUDCAw&uact=5

The first result is

It is served from the Prismic CDN, so I am not sure how I can prevent it from being indexed from my own website.

Any help will be greatly appreciated!

Cheers,
Kris

Hello, just wanted to check if anyone can advise me on this?

Regards,
Kris

Hey Kris!

Thanks for posting this question. Unfortunately, we don't offer granular control over PDF indexing for files hosted in Prismic. As I understand it, the only way to change that is by changing the HTTP headers served with the file.

I can offer a few potential workarounds:

Third-party Hosted PDF
You could host your PDF somewhere like Google Drive or Scribd, where they allow for unlisted documents, and then link to the PDF in Prismic. You could even set up an Integration Field so that all of your PDFs are listed in the Prismic Writing Room.

Self-hosted PDF
You could host the file yourself and set the HTTP headers to prevent indexing. As with third-party hosting, you could set up an Integration Field to list your PDFs in Prismic. This option is more technically involved.
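
A minimal sketch of the header approach, assuming you control the server that responds with the PDF. `X-Robots-Tag: noindex` is the response header search engines such as Google read for non-HTML files; the helper name `pdfNoIndexHeaders` and the file name are made up for illustration.

```typescript
// Hypothetical helper: builds the response headers for a self-hosted PDF.
type HeaderMap = Record<string, string>;

function pdfNoIndexHeaders(fileName: string): HeaderMap {
  // Drop characters that would break the quoted filename parameter.
  const safeName = fileName.replace(/["\\\r\n]/g, "");
  return {
    "Content-Type": "application/pdf",
    // "inline" lets the browser display the PDF in a new tab.
    "Content-Disposition": `inline; filename="${safeName}"`,
    // Asks crawlers not to add this file to their index.
    "X-Robots-Tag": "noindex",
  };
}
```

With Node's built-in `http` module, for example, the result could be passed as `res.writeHead(200, pdfNoIndexHeaders("brochure.pdf"))` before streaming the file. The header only works on responses you serve yourself, which is why it cannot be applied to files on the Prismic CDN.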

Alternative File Formats
In my experience, Google aggressively ranks PDFs, because they are often high-value content. You could try another file format that Google doesn't index, like ePub or MOBI. You also might be able to use an infographic-style PNG, and block Google from indexing it. That way you could host the document in Prismic.

I hope one of these solutions might work for you! Let me know if you have any questions.

Sam

Thanks for the answer Sam.

Could you enable Integration fields for https://hiverywebsite.prismic.io and I will have a look :slight_smile:

Thanks,
Kris

Hey Kris,

Integration Fields are now activated on your repo.

Also, just a heads-up: we're starting to roll out a new feature, the Integration Fields Write API, which allows you to post, update, and delete items in an Integration Field catalogue. That feature is currently in testing.

Let me know how it goes and if you have any questions :slight_smile:

Sam

Hey Sam,

A follow-up question: can you point me towards a way of integrating Google Drive with an Integration Field?

Hey Kris,

I don't have documentation specifically for a Google Drive integration, but we do have general documentation for connecting a custom API to Integration Fields:

Let me know if you have any questions about the process :slight_smile:
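
As a rough sketch of what such a custom API has to produce: to my understanding, the endpoint returns a paginated JSON catalogue with `results_size` and a `results` array whose items carry `id`, `title`, `description`, `image_url`, `last_update`, and `blob`. Please check that shape and the page size of 50 against the current documentation before relying on it. `PdfRecord` and `buildCatalogPage` are hypothetical names, and fetching the file list from Google Drive is left out.

```typescript
// Hypothetical record for one PDF hosted outside Prismic.
interface PdfRecord {
  id: string;
  name: string;
  url: string;
  updatedAt: Date;
}

// Assumed item and page shapes for an Integration Field catalogue.
interface CatalogItem {
  id: string;
  title: string;
  description: string;
  image_url?: string;
  last_update: number; // Unix timestamp in milliseconds
  blob: Record<string, unknown>;
}

interface CatalogPage {
  results_size: number; // total number of items across all pages
  results: CatalogItem[];
}

// Assumption: at most 50 items per page.
const PAGE_SIZE = 50;

function buildCatalogPage(pdfs: PdfRecord[], page: number): CatalogPage {
  // Pages are 1-based; anything below 1 is treated as the first page.
  const start = (Math.max(page, 1) - 1) * PAGE_SIZE;
  const results = pdfs.slice(start, start + PAGE_SIZE).map((pdf) => ({
    id: pdf.id,
    title: pdf.name,
    description: pdf.url,
    last_update: pdf.updatedAt.getTime(),
    // The blob is what your front end would receive, so keep the link here.
    blob: { url: pdf.url },
  }));
  return { results_size: pdfs.length, results };
}
```

A Next.js API route could return `buildCatalogPage(pdfs, page)` as JSON, with `page` read from the query string.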

Sam
