We're migrating hundreds of documents from our existing site. Each document has at least one image, usually two or three. Importing the docs into a release works great for reviewing them before we approve them for the new site. However, it took a couple of attempts to get the import right, and at that point I realized that each time we imported a zip of JSON files, a new copy of every media file was loaded into the media library. We now have someone manually removing the duplicates. Any of several features would either avoid this problem or make it easier for us to deal with, including:
- Use the image filename as a unique id and therefore treat subsequent imports as an update rather than a new media asset.
- Treat the media assets as part of the release, so they must be published/approved before they are added to the media library.
- Add a tool for finding and eliminating duplicate media files through the UI.
- Add API methods to query for media assets and to delete them programmatically.
- Improve the documentation to make it clear that duplicate media assets will be created in this scenario.
- Improve the documentation to clarify what fields are used to establish identity for documents and media assets and how that plays out in linking, exporting, and importing, including as it relates to updating content vs creating new content items.
We'd find all these really useful, but any one of them would help us solve the problem we have right now or help the next person to avoid our mistake. If we had to pick one, we'd choose the API methods, because we'll end up wanting to continue to enrich our media assets with more metadata over time and that would make it a lot easier.
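To illustrate the first suggestion, here is a minimal sketch of the behavior we have in mind: the filename acts as the key, so re-importing the same file updates (or skips) the existing asset instead of creating a duplicate. Everything here is hypothetical. `MediaLibrary`, `MediaAsset`, and `import_asset` are names made up for the example and don't correspond to any existing API; this is an in-memory model of the idea, not a proposal for how it should be implemented.

```python
from dataclasses import dataclass, field


@dataclass
class MediaAsset:
    # Hypothetical asset record; a real one would carry more metadata.
    filename: str
    data: bytes
    version: int = 1


@dataclass
class MediaLibrary:
    # Hypothetical in-memory stand-in for the media library, keyed by filename.
    assets: dict[str, MediaAsset] = field(default_factory=dict)

    def import_asset(self, filename: str, data: bytes) -> str:
        """Create the asset if the filename is new, otherwise update it in place."""
        existing = self.assets.get(filename)
        if existing is None:
            self.assets[filename] = MediaAsset(filename, data)
            return "created"
        if existing.data == data:
            # Same filename, same bytes: nothing to do, and no duplicate is made.
            return "unchanged"
        existing.data = data
        existing.version += 1
        return "updated"


library = MediaLibrary()
results = [
    library.import_asset("diagram.png", b"v1"),  # first import
    library.import_asset("diagram.png", b"v1"),  # repeated import of the same zip
    library.import_asset("diagram.png", b"v2"),  # import after the image changed
]
print(results, len(library.assets))
```

With this behavior, our repeated import attempts would have left a single copy of each image in the library rather than one copy per attempt.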