CWP Open Developer Discussion /

Technical discussion of SilverStripe use on the NZ Govt Common Web Platform.

Moderators: camfindlay, Ed, Sigurd, swaiba

Managing legacy PDFs in /assets/


5 Posts   4739 Views

Ven

Community Member, 3 Posts

17 January 2014 at 11:44am

Edited: 17/01/2014 11:45am

We have quite a few legacy--and some active--publications in PDF. I'm looking at ways to manage those when we move them to CWP SilverStripe.

-- I'd like to give our users short, consistent URLs.
-- I'd like to be able to set rules so we could have a fixed URL that always pointed to the latest example of a particular tagged type: in my case "the current Pharmaceutical Schedule".
-- I'd like to be able to pull up all the examples of given "types" of files in /assets/: all the examples of a particular newsletter, or notification letters, or position descriptions, for instance. Some of these would be for display on web pages, while others would be for use in admin tools, because...
-- I'd also like to ease the lives of our site administrators, who might be trying to locate that one position description they just loaded into the assets bucket.

Ideally, we'd be able to add meta-tags to our documents. I think, but I'm not sure, that the add-on SilverStripe DMS might do things that are roughly what I'm looking for. I don't know how that might interact with the CWP Basic Recipe.

Alternatively, I could create separate sub-directories of /assets/ to act as different "buckets" for our various files. Although this does seem easier to set up right now, I'm reluctant to make the URLs longer (often duplicating what they are: /assets/useful-newsletter/useful-newsletter-2014-01.pdf) and to commit ourselves to URL paths when I know that our products do change their names. We could manage that more elegantly with tagging.

Thoughts?

camfindlay

Forum Moderator, 267 Posts

20 January 2014 at 1:36pm

From what I know about the DMS module, I think it can create a persistent link behind which you can replace files. They aren't pretty links though. It might pay to install this alongside your CWP recipe + custom code and test things out (hopefully you have unit tests you can run for your custom code?).

Other options might be to create a custom controller and routing to serve up files, then build a DataObject to hold your files and any meta data.
Some info here: http://doc.silverstripe.org/framework/en/topics/controller

I'm unsure whether others have used this module on CWP yet, so exactly how well it plays with the current recipe is unknown. DMS is not in the "supported code" list, but that doesn't mean you shouldn't use it, rather that if you do, you need to test well.

I would however probably advise against the asset "bucket" technique you mention... Better off building some sort of model structure to hold your file and metadata which means you would also be able to expose it via the RestfulServer and offer an XML/JSON API into your document listing if others reference your data.

You could also use the DMS module and extend or improve it as part of your project and commit those back to open source :)

Hope that helps.

candidasa

Community Member, 10 Posts

21 January 2014 at 6:04pm

I built the DMS module. The advantage it gives you is a polished user interface for managing files and (many/many) linking them to pages.

Adding arbitrary metadata was on our todo list, but we never got around to adding it. It would be really great if someone picked that up and implemented it (there is an unused "tags" many_many relation sitting in the DMSDocument class, waiting for some enterprising developer to give it some love).

As Cam says the DMS isn't officially part of the Basic Recipe, but I see no reason why it wouldn't work. Using Solr for searching within documents is a bit of a pain to set up, however.

-- I'd like to give our users short, consistent URLs.
With the DMS you get URLs such as: www.website.co.nz/dmsdocument/8137

-- I'd like to be able to set rules so we could have a fixed URL that always pointed to the latest example of a particular tagged type: in my case "the current Pharmaceutical Schedule".
This doesn't come out-of-the-box with the DMS, but you could set up a custom RedirectorPage that redirects to a DMSDocument according to some search criteria. That shouldn't be difficult to do.
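
As a rough, language-neutral sketch of that redirect logic (written in Python rather than SilverStripe PHP; the `TaggedFile` record, its fields, and the `/current-schedule` address are hypothetical and not part of the DMS module), the "search criteria" could be as simple as "newest file carrying a given tag":

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class TaggedFile:
    # Hypothetical record: where the file lives, its tag, and its upload date.
    url: str
    tag: str
    uploaded: date


def latest_for_tag(files, tag) -> Optional[TaggedFile]:
    """Return the most recently uploaded file carrying `tag`, or None."""
    matching = [f for f in files if f.tag == tag]
    return max(matching, key=lambda f: f.uploaded, default=None)


files = [
    TaggedFile("/assets/schedule-2013-12.pdf", "schedule", date(2013, 12, 1)),
    TaggedFile("/assets/schedule-2014-01.pdf", "schedule", date(2014, 1, 1)),
    TaggedFile("/assets/newsletter-2014-01.pdf", "newsletter", date(2014, 1, 15)),
]

# A fixed address such as /current-schedule (hypothetical) would redirect
# to whatever this lookup returns at request time.
target = latest_for_tag(files, "schedule")
```

Because the lookup runs on each request, loading and tagging a newer file changes the redirect target without anyone editing the link.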

-- I'd like to be able to pull up all the examples of given "types" of files in /assets/: all the examples of a particular newsletter, or notification letters, or position descriptions, for instance. Some of these would be for display on web pages, while others would be for use in admin tools, because...
-- I'd also like to ease the lives of our site administrators, who might be trying to locate that one position description they just loaded into the assets bucket.
You can do this by extending the DMSDocument class and adding an additional has_one relation to a list of categories. If you assign a particular category to each DMSDocument, then you can pull up lists of each type into pages and admin sections. Or, you could use a many_many relation, and then you have tagging-style functionality. Tagging shouldn't be too difficult actually. Basically I would just throw in the Tagfield module (https://github.com/chillu/silverstripe-tagfield) using composer and link it up to the tags many_many.
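
To illustrate the difference between the two shapes, here is a minimal sketch in plain Python, not SilverStripe code. The `Document` class and its fields are hypothetical stand-ins, not the DMSDocument schema. A single `category` plays the role of the has_one relation, while the `tags` set plays the role of the many_many relation:

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Document:
    # Hypothetical stand-in for a document record.
    doc_id: int
    filename: str
    category: str                            # has_one style: exactly one category
    tags: set = field(default_factory=set)   # many_many style: any number of tags


def index_by_tag(documents):
    """Build a tag -> [documents] lookup for listing pages or admin views."""
    index = defaultdict(list)
    for doc in documents:
        for tag in doc.tags:
            index[tag].append(doc)
    return index


docs = [
    Document(1, "newsletter-2014-01.pdf", "newsletter", {"newsletter", "2014"}),
    Document(2, "schedule-2014-01.pdf", "schedule", {"schedule", "2014"}),
    Document(3, "newsletter-2013-12.pdf", "newsletter", {"newsletter", "2013"}),
]

# has_one style listing: filter on the single category.
newsletters = sorted(d.filename for d in docs if d.category == "newsletter")

# many_many style listing: one document can appear under several tags.
by_tag = index_by_tag(docs)
from_2014 = sorted(d.filename for d in by_tag["2014"])
```

The category approach keeps each document in exactly one list; the tag index lets the same file show up in as many lists as it has tags.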

I would advise against using the assets buckets. I've found the assets storage system breaks down as you add lots of files in a complex folder structure. You need to be very disciplined with that kind of approach, otherwise you end up with asset-chaos ;)

Good luck! Let us know how you go.

Ven

Community Member, 3 Posts

21 January 2014 at 8:26pm

Thanks to both of you for your responses.

The fact that I'm doing a lift-and-shift of an existing site under time pressure means that I'm not going to be able to follow up the SilverStripe DMS idea for launch, but since the idea is continuous improvement/development after that, it's something that might come later.

The responses did make me realise, though, that there are other requirements. It sounds like those might add further levels of complexity.

-- URLs and file names must be human-readable, and must follow certain naming conventions.

File names will always be in a format along the lines of "schedule-2014-01.pdf" or "notification-2014-01-22-drug-name-scrabble.pdf". It must be possible to tell by looking at the file name what the file contains. If the naming convention is followed it should be rare to get a duplicate file, or to load a file a second time because you didn't realise it was already there. When our users save our files locally, the files should sort into order, and it should be clear what type of thing each file is. File names that are automatically-generated numbers aren't going to cut it, sorry. What options do we have within SilverStripe DMS for managing our quite intentional file names, and replacing/simplifying the directory name?

Setting up many-many tagging sounds like it'd be worth exploring. Is there any reason, if we hope to do that in the future, not to put our files loose in the /assets/ directory now?

Second point: persistent links. We can set up a manual link fairly easily, that we update each time we load a new schedule or whatever. But we're looking to replace a system which does that automatically: you load a new file (with its own file name), tag it as a Schedule, and the link we hand out that always goes to the latest Schedule will then go to it without anyone having to do anything else. We'll survive without that, but it's a refinement that we've found useful.

Again, given that we're not going to have much functionality in our initial build--not even many of the things that are built into the Basic Recipe, but which clash with things of ours that we'll need some time to change--I'm interested in both ways to do things simply but effectively now, and ways to make sure we have options when we develop things later.

candidasa

Community Member, 10 Posts

22 January 2014 at 11:10am

With time pressure and the need for human readable URLs you probably will have to just set up a file structure in assets. It won't be a long term solution, as it will degrade over time, so I would make sure to set the expectation with the site owner that extra work will be required after launch.

The DMS doesn't have the human readable URL feature that you mention. That is a good feature to have, mind you. Thank you for highlighting this as a requirement.

I think we could achieve it reasonably easily by adding a human-readable component taken from the file title, while still using only the number for the lookup. You'll notice Amazon uses this trick. Try these two URLs; only the stuff after dp/ matters:

http://www.amazon.com/SilverStripe-Complete-Guide-CMS-Development-ebook/dp/B003GY0K7A/
http://www.amazon.com/Ven-amazing-lift-and-shift-website/dp/B003GY0K7A/

The reason to keep a unique ID is that it is so much easier to manage. No folder structure to worry about, no duplicate detection, moving a file around is easy. Files can be in multiple locations at once. With the SilverStripe page tree we have to handle all of these edge-cases and it is very fragile. It took years to get right.
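
A minimal sketch of the slug-plus-ID idea, in Python for illustration only. The `/dmsdocument/<slug>/<id>` shape is an assumption layered on the existing `/dmsdocument/8137` style, and `slugify`, `build_url`, and `resolve_id` are hypothetical helpers, not DMS functions:

```python
import re

# Assumed URL shape: /dmsdocument/<optional-readable-slug>/<numeric id>
DOC_URL = re.compile(r"^/dmsdocument/(?:(?P<slug>[a-z0-9-]+)/)?(?P<id>\d+)/?$")


def slugify(title):
    """Lower-case the title and collapse runs of other characters into hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def build_url(doc_id, title):
    """Build the readable URL; the slug is decoration, the ID is the key."""
    return f"/dmsdocument/{slugify(title)}/{doc_id}"


def resolve_id(path):
    """Return the numeric ID from a path, ignoring whatever slug precedes it."""
    match = DOC_URL.match(path)
    return int(match.group("id")) if match else None


url = build_url(8137, "Schedule 2014-01")
```

Since only the trailing number is used for the lookup, a link keeps working if the title (and therefore the slug) changes later, and the older slug-less URLs still resolve.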

Automatic persistent links are also a nice idea. Nice point! It depends on the tags feature working. But again, you'd have to build it.

So, to your question of assets vs. DMS. I don't think you'll get around doing some rework. We once migrated a website with an assets system to the DMS. The migration script was a bit of work, so I'd recommend you keep as much metadata as possible with the files in your initial solution. So, either extend the File class and add extra database records to that, or encode information into the filename in some consistent way. The idea is to be able to use that information to automatically migrate the files later.
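
For the filename route, here is a hedged sketch of what a later migration script might do to recover metadata. The pattern is an assumption inferred from the two example names earlier in the thread ("schedule-2014-01.pdf" and "notification-2014-01-22-drug-name-scrabble.pdf"); the real convention may well differ, and `parse_filename` is a hypothetical helper:

```python
import re

# Assumed convention: <type>-<yyyy>-<mm>[-<dd>][-<free-text-slug>].pdf
FILENAME = re.compile(
    r"^(?P<type>[a-z]+(?:-[a-z]+)*)"      # e.g. "schedule", "position-description"
    r"-(?P<year>\d{4})-(?P<month>\d{2})"  # year and month are required
    r"(?:-(?P<day>\d{2}))?"               # day is optional
    r"(?:-(?P<subject>[a-z0-9-]+))?"      # trailing description is optional
    r"\.pdf$"
)


def parse_filename(name):
    """Return a metadata dict for a conforming name, or None otherwise."""
    match = FILENAME.match(name)
    if match is None:
        return None
    meta = match.groupdict()
    # Convert the date parts that are present from strings to integers.
    for key in ("year", "month", "day"):
        if meta[key] is not None:
            meta[key] = int(meta[key])
    return meta


schedule = parse_filename("schedule-2014-01.pdf")
notification = parse_filename("notification-2014-01-22-drug-name-scrabble.pdf")
```

Names that return `None` are the ones that would need manual attention, so running a check like this at upload time is one way to keep the convention honest before any migration happens.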

Good luck!