"Status" - filtering by status for Algolia indexing

We use Algolia for search in our application, and all of our content lives in Prismic.

We have a pipeline that successfully runs every night to re-index the site.

The problem is that archived items are still being re-indexed.

--> In Prismic, you can search through your content and filter by Status: Draft, Published, Release, Archived (see screenshot attached).

When I do a 'get all' of our content from Prismic into raw JSON data, I do not see any field that corresponds to this status.
I'd like to add a line to my re-indexing script so that it pulls only items where Status === 'Published'.

Does anyone have any ideas or know where that "status" lives in the data or how to best accomplish this?

Thanks in advance.

Hi @jennifer.martelle,

Thanks for reaching out! As far as I'm aware, your repository's archived documents should not show up in the API response. Could you give me a couple of examples, either here or via DM if that's easier for you, of archived documents that still show up when you query them through our API?

Hi, thanks for responding to this!
Currently, all archived documents are still showing up in our front end after the nightly re-index.
The script that runs nightly has:

const dataApiLibrary = dataApiLibraryRaw
  .filter((item) => item.data.api_title?.trim() !== "")
  .map((item) => {
    // Initialize description
    let description = "";

...and so on.

I want to change the filter to add a status check:
.filter((item) => item.data.status === 'Published' && item.data.api_title?.trim() !== "")
in the hope that only Published documents get through. I'm sure this is incorrect for the data structure, though: when I look at the raw JSON of our Prismic content, which is fetched with a dangerouslyGetAll function, I do not see Status anywhere. Any ideas?

The status is indeed not going to be in the API response, as the API simply doesn't return archived documents. One possible cause of what you're seeing is that you're not using the latest ref when querying the API.

For example, say your repository has a ref ABC, to which all your published content is attached. If you archive a document, a new ref DEF is created as a result. If you then query the content at the end of the day with ABC, the archived document will still appear there as published. You need to make sure that you always query with the latest ref to guarantee you're getting the freshest version of your repository.
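
As a rough sketch of that idea (not necessarily how your script is set up), a nightly job could look up the current master ref right before each run instead of reusing a stored one. This assumes Node 18+ with a global fetch, a public repository (a private one would also need an access token), and the standard REST API v2 response shape; pickMasterRef, fetchPublishedDocuments and "your-repo-name" are placeholder names for illustration only.

```javascript
// Sketch only: resolve the current master ref at the start of every run,
// then page through the published documents attached to that ref.

// Pick the master ref out of the repository metadata returned by /api/v2.
function pickMasterRef(repoInfo) {
  const master = (repoInfo.refs || []).find((r) => r.isMasterRef);
  if (!master) throw new Error("No master ref found in API response");
  return master.ref;
}

// Fetch every document at the current master ref, page by page.
// fetchImpl is injectable so the function can be exercised without a network.
async function fetchPublishedDocuments(repoName, fetchImpl = fetch) {
  const endpoint = `https://${repoName}.cdn.prismic.io/api/v2`;

  // Always ask the API for the refs now; never reuse yesterday's value.
  const repoInfo = await (await fetchImpl(endpoint)).json();
  const ref = pickMasterRef(repoInfo);

  const documents = [];
  let url = `${endpoint}/documents/search?ref=${encodeURIComponent(ref)}&pageSize=100`;
  while (url) {
    const page = await (await fetchImpl(url)).json();
    documents.push(...page.results);
    url = page.next_page; // null on the last page
  }
  return documents;
}

// Hypothetical usage in the nightly script:
// const dataApiLibraryRaw = await fetchPublishedDocuments("your-repo-name");
```

If the script already goes through a Prismic client library, the thing to check is whether a ref is being stored, cached, or passed in explicitly somewhere between runs.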

Does that make sense?