Site building workflow challenges – keeping HTML in a database

I was reminded to today of one of my pet hates – coordinating a site build, or a site rebuild when the CMS you are using keeps content, often containing HTML markup from use of an editor such as tiny mce, in the database.

Consider the following scenario:-

  • You have a staging site where the client has been using the CMS to input content
  • Meanwhile, you make some changes to the database on your local version and want to push them to staging
  • You can use a migration script to push your changes to the live database but you find yourself wanting to also copy the new content back to your local database, so you can work on CSS on real content. You then would probably drop your local database and restore from a backup from staging, losing any test content you put in locally

It’s basically a bit of a kerfuffle.

This is one of the scenarios that I hope could be avoided with a CMS based on eatStatic (if I ever develop it beyond a blog engine) – any content-types that contain bodies of text, whether thay are marked up with HTML or not, would be stored on the filesystem. This could be put under version control, so you can selectively synchronise your content with another instance of the site.

I can also see a case for some add-on for any existing CMS – an export function that routinely pushes text content from the db into text files to be kept under version control, and also allows import, allowing instances to selectively sync content.

So what is it that I do, exactly?

different stuff that Rick Hurst does - freelance web developer, olivewood, kudos, lightplanet, foundry

If you ask any of my family or friends what it is that I do for a living, you will most likely get, at worst, a blank look, or, at best, maybe “web designer?”, “computer programmer?”, or maybe just “something to do with computers?”. It doesn’t help that my linked in profile has me claiming to be doing at least three different jobs at a time, and that I seemed to be involved with a number of different companies. I think possibly even those that work with me aren’t entirely sure what it is that I do most of the time! I thought i’d clarify what i’m up to at the moment, and how things have evolved over the last few years.

I am a freelance web developer

Primarily I make my living as a freelance web developer. This means that I spend most of my time building websites. By “building” a website I mean taking it from an idea to a finished website – planning, designing, templating, coding and uploading to a web server so that it is live on the internet. As well as web “sites”, I build web “applications”. These are computer programs that people interact with through a web browser. A web browser is the computer program that you use to look at websites e.g. internet explorer (the blue E), firefox (the fox wrapped around the world) or maybe safari (the compass). As well as looking at websites on the internet, you may be using a web browser to use web applications in your workplace.

But am I a designer?

I occasionally design websites, but it isn’t my specialism. If budget allows I prefer to hire a really good designer to come up with the design concepts. I know lots of good designers. I take their design concepts and run with it to put all the technical stuff in place to take it from an idea to an actual website. When I work for a design agency, I am hired always for my technical skills, never for my design skills – let’s face facts, I think there’s a reason for that! Most of the work on my “portfolio” wasn’t designed by me – I usually just did the technical bits.

Do I do any computer programming that isn’t related to web sites or web applications?

Increasingly more. I seem to be doing more and more data crunching these days, involving writing scripts that move files around on servers and extracting data from different places and putting things in databases. I would call that programming. Most of this type of programming is actually using skills I learned while building web sites and web applications, but some of the things i’ve built aren’t web sites or web applications at all.

So am I actually freelance, or do I work for a company?

By freelance, I mean that I work for myself, but companies and organisations employ me on a freelance basis. Therefore sometimes I may appear to be working for several companies at once. Sometimes I work directly for my own clients, other times transparently as an additional resource for a design or digital media agency. Either way, I invoice for the work I do on either a fixed price or a “time and materials” basis, and this is how I make a living.

What is Olivewood then?

In 2007 I co-formed a company called Olivewood Data Technologies Ltd, and all my freelance work is invoiced through Olivewood. Olivewood is co-owned by a client and friend, and we set it up primarily as a vehicle for consulting and development services, but also as a legal entity to own the IP for a number of niche eCommerce web applications that we plan to sell to other companies. At one point we thought Olivewood might be a digital media agency, but now we are pretty sure it is a software consultancy. [Update 2014 – Olivewood is now a supplier of high power LED lighting specialising in lighting for cold storage warehouses – how things change!]

And what about Foundry?

In May 2010 I sat on the watershed balcony with fellow freelancers Dan Fairs and Dan Hilton and talked about teaming up to be able to take on and pitch for projects bigger than we could handle as individual freelancers. We came up with the name Foundry, and shortly after collaborated on a successful project together. We all still work as freelancers, but hope to spend more time working together under the Foundry banner in 2011 and beyond. The challenge is moving the focus from looking after our own interests and incomes to working together, and to do that we need a big project that would keep us all too busy to take on other freelance work.

Will I fix your computer?

No I flippin’ wont! Have you tried switching it off and on again?

plone conference 2010 day 2

Plone conf day 2

Some notes and thoughts from day of plone conference 2010. These are mostly in note form, but wanted to get these online anyway, while they are fresh in my mind.

fixing the ungoogleable by Elizabeth Leddy

What happens when something breaks, and a simple google search doesn’t offer any results?

First, warn people (so they don’t bother you while you fix it, and they don’t panic).

Work out how quickly can you access your backups? I think this is very important – personally for smaller sites i’m doing individual site backups as packages mainly to make it convenient to restore them in a local dev environment, but I know some people are relying on whole server backups on tape. This would be laborious to restore in an emergency.

Isolation by elimination -> network, hardware, software, data
map out your system!

Eliminate things by switching them off – write your code so that it can handle dependencies being switched off.

Set up a system so that you have an isolated instance (so you can look at logs of only your own activity, rather than mixed with everyone elses)

Create a dashboard (maintenance page)

Monitoring

Write Test case’s to diagnose faults

Start writing a “help” email, but don’t send it. This is apparently known as “rubber ducking”. I do this all the time, I often spot my mistake when I past in a traceback to an email message to send to someone!

“horizon of intervention” – At what point do you need external help?

Get to know people with specialisms and buy them a beer! Participate in the plone community (I really need to do more of that).

Create tickets and close them as you go

Document your processes.

The state of plone caching by Ricardo Newbery

cachefu – useful tool, but as far as “internet time” goes, this is a little long in the tooth now (started 2006). End of life – critical bugfixes only

plone.app.caching – will be in future plone releases, but available now

load testing with funkload (not accurate regarding CSS/ image discovery)

load testing with multi-mechanize

BrowserMob – not free

Building a custom app with plone with minimal development by Eric BREHAULT

This was extremely interesting – Eric is the project manage for Plomino – a plone add-on that provides application development toolkit for creating database applications within plone – i.e. the types of database applications that non-technical people might create on there own machines with Lotus Domino, Filemaker or Access. The data is still stored in the ZODB, but models and forms can views can be created through the web. Formulas are done in python.

I can see this being really useful – all too often I end up building custom applications based on complex access database or excel spreadsheets i’ve been provided, but if I can persuade clients to use Plomino, then not only will it help get that data in a format suiteble for building a web application, but data can be collaborated on across teams, rather then emailing around (and inevitably forking spreadsheets and databases). There were some nice examples of data visualisation – a great quote “almost useless, but very nice”” – you need these to impress your boss.

Themeing with XDV (Diazo) Laurence Rowe

I wrote some notes on this yesterday, connected to Nates Deliverance talk, so won’t go into this here, other than to quote Laurence: “we write XSLT, so you don’t have to”

collective.amberjack: chapter one. The interactive age. Massimo Azzolini

Amberjack is a Javascript library for creating a site tour/ tutorial. Collective amberjack wraps this up as an add-on for plone. Interactive tutrials can be created.

An example was given using the windmill testing framework (windmill in itself looks nice alternative to selenium).

The Art of Integrating Plone with Webservices with David Glick

Most of this was over my head, but one important note I made from this is about urlib (which I use in a django screen scraper app i’m developing) can have two possible error responses – URLError and HTTPError – two possible error responses.

external ecommerce and plone playing along with Sasha Vincic

There seems to be a bit of a theme going here – an acknowledgement that Plone works out best long term if you use it as a “black box” CMS, and don’t try to do everything with it. The upgrade path is easier if you don’t add on your own customisations – “clean plone”. The current plone ecommerce offerings are not as good as external systems, so it is better to integrate with an external system, which is also then kept clean to allow easier upgrades.

In the python world there are some Django Ecommerce stores LFS and Satchmo, but the store doesn’t have to be Python – other proven systems such as magento can also be integrated with Plone.

To integrate with plone you need to integrate search, linking, thumbnailing. Valentine achieved search compatibility by creating objects in plone via an RSS import – see valentine.rssobjects. A latecomer to the talk asked “but which plone ecommerce product would you use if you had to?”. Answer “we wouldn’t”.

collective.transcode.star (lightning talk)

manage online transcoding
http://plone.org/products/collective.transcode.star – looks interesting, and may bring plone back into the picture for a project i’m doing.

BlueDynamics bda.plone.finder

osx style finder widget for navigating/ organising content in plone – looks great!

Plone Conference 2010 Day 1

quality schwag from Plone Conference 2010

Just back home after day one of Plone Conference 2010, with my mind buzzing so thought it would be a good time to write up some of my thoughts and notes. It was really difficult to choose between the talks on offer on the three different different tracks, but here are some thoughts on the ones I attended.

Keynote by Alexander Limi and Alan Runyan

Two main themes here – ubiquity/availaibility and designer friendliness.

To make Plone more mainstream it needs to be available to non-technical end users through the same means that other systems are already available – namely being able to deploy easily on cheap hosting, specifically the one-click installers on shared hosting in cPanel and similar. This would allow users to easily evaluate Plone for their needs in the same ay that they can already with wordpress, drupal and joomla – apparently there is a new joomla instance created about every two minutes. I must stop Alan Runyan and see if he has thought about microsoft web platform installer – nowadays this includes the option to install wordpress, drupal, modX, and load of other systems, including downloading and installing dependencies. It would be great if Plone was in that list.

One of the aims of Plone 5 is to make it more designer friendly. I think this is really important – even though since the release of plone 4 i’ve started using plone again for intranets and extranets (mainly straight out of the box with a few minor cosmetic tweaks), I currently still use something like wordpress, or a home-rolled CMS for website builds. That is now going to change – the theming story is being completely re-written by the introduction of Deliverance/XDV/Diazo (already available – more on that later), and Deco (TTW layout and content editing). The aim is to make Plone appeal to designers as something that helps, not hinders them.

Quote of the talk has to be from Limi – “Plone doesn’t suck, because the developers don’t hate the core technology” (or something like that) – in reference to the revelation that many drupal/wordpress/joomla developers admit they actually hate PHP, whereas Plone developers love python.

Deco: new editing interface for plone 5

The next talk I attended was Rob Gietema’s demo of Deco. This is looking really good, although i’m a little bit skeptical of drag and drop and in-place editing (I like front-end based editing, but prefer lightboxed modal editing to in-place), mainly because i’ve seem layouts explode and page elements disappear, or refuse to drop in the correct place on similar systems in the past. However, I haven’t actually tried this one yet, maybe i’m just clumsy! I think in general designers and content editors are going to love it.

LDAP and Active Directory integration

I attended Clayton Parker’s talk on LDAP and active directory integration – can’t say I absorbed much, but i’m sure i’ll be asked to do this one day, so it’s good to know that this is tried and tested and the tools are already there.

Easier and faster Plone theming with Deliverance and xdv

Nate Aune gave us an overview of Deliverance. I’ve known about Deliverance for ages, but the penny dropped for me today about how useful this is. The basic principle is this – deliverance acts as a proxy to transparently take HTML output from a website and merge it with HTML from a theme, according to a simple set of rules. In the case of plone, this means you can create a theme in static HTML and have content from a default theme Plone site displayed wrapped up in the static HTML. Simple rules can be applied e.g. “take the news portlet from the plone site, drop the header and footer and all the images and display in the element with and id of “recent-news” from my HTML theme. magic!

Nate quoted one example where the HTML theme is stored in a dropbox folder which the client has access to to make tweaks and changes. I can see front end developers and designers loving this.

There was much discussion at the end over which technology should be used for this – XDV is a fork of an earlier version of Deliverance, which has slightly different functionality. XDV, which is to be renamed Diazo, will be the theming engine for Plone 5. With that in mind, i’ll concentrate my efforts on Diazo. I’m excited by this for non-plone reasons – a majority of my works seems to involve integrating technologies that don’t belong together – this will really help.

Design and development with Dexterity and convention-over-configuration

Martin Aspelli gave a talk on dexterity – the (eventual) replacement for archetypes. This is already available, but not mature yet. The talk was mainly conceptual rather than code-led, focussing on best practice for designing your site or application – when it is suitable to create a content type, and when you might be better off creating a form, or using a relational database. Best quote “code is like a plastic bag” (reduce, reuse, recycle). Write less code.

Laying Pipe with Transmogrifier

Another talk from Clayton Parker – transmogrifier is a system to package up migrations of content from other systems. My thoughts on this were that it looked like hard work for a one off import (usually i’d write a one-off python script for something like this), but creating packages would benefit the plone community e.g. if there were packages available covering migrations from a standard wordpress, drupal or joomla, this would benefit plone. I suppose this could also be used to import content from older instances of Plone, where the upgrade path is broken.

Multilingual sites – caveats and tips

Sasha Vincic talkd about strategies and gotchas for multilingual site builds. Even though Plone has tools for this, there are common scenarios, such as the “missing page” scenario where a translation of a site may not have the same number of pages as the base translation. He also covered common issues such as escaped HTML being translated by third parties and being delivered content where HTML attributes have been translated, therefore breaking the HTML.

Guest Keynote: Challenging Business

This was an inspirational treat for the end of day one – Richard Noble is a fantastic speaker, and after a day of CMS talk it was great to hear his story of the challenges of his past world land speed record record achievements, and the current one – the Bloodhound SSC project. As well as building insane rocket powered cars (the current one has an F1 engine onboard just to drive the fuel pump!), his goal is to inspire children and young people to become engineers, as there is an impending massive shortage of engineers in training. I was also interest to hear that there will be no patents on the technology developed for the new car – the advancements will be made avaiolable for anyone in the engineering industry to build on – sound familiar?

archived comments

Thanks for the roundup Rick! Looking forward to the rest.

Mike 2010-10-27 21:29:18

Good, clear writeup. I could not make it to the conference this year. It’s nice to be able to follow it a bit this way. Thanks, Rick!

Maurits van Rees 2010-10-27 21:42:13

Thanks for the comments guys – videos for all the talks will eventually be online, so hopefully everyone who couldn’t make it won’t miss out 🙂

Rick 2010-10-27 21:58:26

Thanks for the post!

I agree with you re: transmogrifier which is (partially) why I wrote http://pypi.python.org/pypi/parse2plone

Keep the blogs coming 🙂

Alex Clark 2010-10-28 00:50:01

Tate Movie Project



Earlier on this year I was fortunate enough to be asked to help the Aardman Digital team out on the companion website for the Tate Movie Project . This was one of the most fun and technically challenging website builds i’ve worked on. Working as part of the team, along with several other Bristol freelancers, I helped integrate the cakePHP site with wordpress and vanilla Forums. This was also one of the largest site builds i’ve worked on – multiple flash developers, PHP developers, designers, animators, front end developers and producers, all coordinated by subversion, unfuddle and the biggest wall of printed out screen grabs i’ve ever seen!

Deploying and maintaining a website using svn

I use svn (subversion code version control) in a very simple way, but it has become an essential tool for how I build, deploy and manage websites. If you aren’t using some sort of source control, you should be. If you are, but you are using it only as a source code backup, you might be interested to see one of the ways it can be used as an integral part of the workflow for building, deploying and managing website updates. I’m not going to go into svn commands, this is more of an overview of the process.

This is my typical workflow, based on a php/mysql website build, but the process is relevant to most technologies:-

Create a new project in your svn repo
This basically consists of making a new directory in the root of the repo. At this stage, an experienced svn user might suggest you create sub-directories for “trunk” and “branches”, but I don’t bother (yet). I have my svn repo hosted on a web server so it can be accessed over https, by any machine with web access and an svn client installed.

Check it out to your local machine
I always develop on my local machine, I will jump through hoops to make sure I have a portable, standalone development environment for what i’m working on. It’s ok to develop on a remote server, as long as you have your own “sandbox” to work in, where you won’t be treading on anyone’s toes or vice versa. I always set up local hostnames and virtual hosts on my machine, e.g. myproject.macbook.local to allow me to easily work on multiple sites on the same port no. on my laptop. Check the project out and set the working copy folder as the root of your local website. When i’ve subcontracted work in the past, I have insisted that freelancers set up a local development environment and work with svn, even if we aren’t collaborating – there’s no way that i’m going to accept the work as a zip file over email! Working this way means that I can periodically check out the project to my machine to see how they are getting on, rather than having to ask them.

Build your site
As you add files to the site, add and commit them to svn. If you need to rename or delete a file, use svn to do it, to stop it getting out of sync. If you are collaborating with someone else, do an svn update and commit frequently (in that order). This way if you get any conflicts you can resolve them more easily than if you leave it till the end of the day. I find in that scenario it often becomes a “race” to commit stuff frequently, so that the other person is more likely to have to resolve conflicts if any turn up!

Contextual config file
I want to be able to deploy an identical code base to the live (and test) server environment, so I use some “if” statements that look at the host name to selectively load variables such as database config and anything else site/server specific, depending on which environment the site is running on.

Checkout the project on the live/ test server
Firstly, this scenario will only work for you if you have terminal/ ssh/ remote desktop access to your server. If you only have ftp or sftp available, this technique won’t work for you. If that is the case then you may have to resort to uploading the files individually or in bulk – which is a massive pain compared to being able to run an svn checkout/ export/ update on the live server.

There is also some debate about whether deploying or managing the live codebase as an svn working copy is a good idea. I think it is for the following reasons:-

  1. Occasionally you may need to debug something on the live/ test server. e.g. if something behaves differently to your local environment, or some other edge case. In this scenario if you fix the bug on the live server you can commit it back to the repo and do an update on your local copy.
  2. If you are managing user contributed uploads and want to keep them in svn. You can then log in to the server occasionally and add/ commit these back to the repo.
  3. Keeping an eye on things. A while back I spotted that an old wordpress site I was hosting had been hacked. Simply by doing an svn stat on the root of the site, I could see that some files had been modified – that’s an edge case, but it’s a useful tool for getting quick feedback if anything has changed on the live server that you weren’t expecting.

One thing you will want to do if you manage the live site as a working copy is make sure your web server is configured not to serve up the .svn folders, or any other folders not part of the live site.

Database changes
I always keep a mysql dump in the svn repo. This may or may not include data, depending on the size of the database, or the need to keep config data in the database. Once again this should be in a directory that is not served up by your webserver. For a first deployment, I will use a checkout of the mysql on the live server to create and populate the database. Thereafter, I use numbered migration scripts to manage any (structural) changes to the database, such as adding or altering tables. These are added/ commited to svn, then checked out on the live server to run and apply the changes to the live site.

Getting started with svn
If you are reading this and thinking “sounds good, but i’ve never used svn”. A good place to start would be to sign up for a free hosted svn repo with someone like beanstalk, and then try one of the many GUI svn clients. Rapid SVN is a good free one for most platforms, but there are tons out there. It will help to learn the basic commands, and be essential if you only have ssh access on your server. I said I wouldn’t go into commands here, but to put things in perspective, I get by with very few: svn co (checkout a project), svn add (add file(s) to project), svn commit (commit changes to file(s)), svn update (update the working copy with latest changes), svn resolved (mark a conflict as resolved), svn mv (rename a file and tell svn to sync on next commit), svn rm (delete a file and tell svn to sync on next commit) and finally svn revert (go back to the last revision, or a specified revision).

There’s more (but I don’t know it)
I’m fully aware that there are more sophisticated ways to use svn to manage and deploy projects, involving branches, merging, switching, externals and a whole load of other clever stuff that I have yet to become familiar with. However a simple project with checkouts/ commits and updates on local and live server works well for me, and is a good starting point to get stuck in and start to feel comfortable with it. Also, apparently all the cool kids have abandoned svn for git, which works in a similar, but different way, so that might be worth a look too, if you aren’t already familiar with svn.

You’ll never go back
I’ve been working this way for a few years now, but occasionally i’ll do some work for a company that don’t or won’t work use version control, or don’t use it as an integral part of the workflow (i.e. they just use it as a backup). Without version control as an integral part of the workflow, I have often had to resort to pen and paper to remember what files have changed, so I know what to upload, i’ve accidentally overwritten new code with old versions and mysteriously lost days of work when someone else does an update from their non-versioned working copy. The learning curve is worth it, and so is getting your head round how to resolve conflicts (usually the first thing that scares people away from svn).

archived comments

Nice post Rick.. I’m interested to find out a bit more about how you manage which files are served up by your webserver?

That’s the bit I’m a little lost on at the moment.. I use SVN a lot more now thanks to VersionsApp (http://versionsapp.com)

Nik 2009-06-04 10:20:09

I’ve been using SVN for a while now, but you’ve pointed out some uses/applications that should be pretty useful – thanks. I can recommend tortoise svn as a client.

I’m sure there is more that can be done with DB version control, but as you say there is a ton to learn.

Joel Hughes 2009-06-04 10:21:45

@Nik in my virtual host config I have something like the following:-

RedirectMatch 404 /\.svn(/|$)

(There should only be one backslash in front of .svn – need to update the blog software to do a stripslashes!
This stops svn stuff being served up. (I did write some other stuff to help with hiding other directories, but comment formatting messed it up, so removing!

@Joel +1 for tortoise svn on windows. I like the feature that you can commit a directory and it gives you the option to add any unadded files, and select which other files should/ shouldn’t be committed at the same time. A nice time saver

Rick 2009-06-04 10:58:16

Weighing in on the whole “working copy” debate (which you probably don’t want to get into!)… I really have never heard even a slightly valid argument for using an export rather than a checkout.

One thing I would say, though, is as well as denying access to .svn (a must), also make sure that your SVN credentials (passwords, and so forth) are NOT stored on the webserver. When you do an update/checkout, your SVN username and password will probably be stored (by default) in ~/.subversion/auth.

Instead, you can set the “store-passwords = no” and “store-auth-creds = no” in your ~/.subversion/config file. This is a bit of a chore as you have to type the password each time you update.

The alternative is to make sure the SVN user is read-only, but even that gives a hacker access to historical (and future) release info.

Tom 2009-06-04 11:10:11

Good post Rick. I’m going to point my SVN shy housemate at it later.

Not wanting to sound condescending, as I sometimes run my own personal projects as you describe i.e. just using the trunk, but when working on bigger, more complicated sites you should really considering creating a tag (just a svn cp of trunk) and deploying that so you can rollback in case of error.

Creating a SQL patch to the tag and one from the tag (i.e. one to add the new column and one to remove it helps when deploying sites that need to say up)

Andy Gale 2009-06-08 13:39:04

@Andy – yep absolutely – I just haven’t got round to trying it out so that I feel comfortable with it. Do you have a way of running sql patches automatically?

Rick 2009-06-08 13:55:35

Not yet… sometimes it counts which order the patches are run… but it would be fairly easy to automate it if you worked out a naming convention. That’s my plan for our next major project as it would help to auto-update each development environments as well the live site.

Andy Gale 2009-06-10 11:44:17