URLs are important. Content should exist at one and only one URL. Each variation in content needs to have a unique URL. Django Text Variation tries to be as flexible as possible by using middleware to check each requested URL against the configured patterns.
A great resource for my research was this blog post regarding methods for handling multilingual websites. Its author defines nine ways of accomplishing language variations in content, many of which are not generic enough for use here.
First and foremost, query parameters are not recommended and will not be supported.
Ultimately, there are three ways to specify a unique variation: by domain, by path, or by path parameters.
You can use any combination of these methods to specify your variation.
The domain difference could be at the host level, such as dim1.example.com or at a different top-level domain such as example.es. Top-level domains are probably only useful for specifying language variations.
Note
You can only vary on one dimension within a domain. That means you can vary on language (en.example.com) or audience (kids.example.com), but not both (en.kids.example.com) within the domain.
A common variation, especially for language specification, is to have the variant prefix the path, such as www.example.com/en/ or www.example.com/es/. However, encoding the variant within the path (www.example.com/blog/en/) is also supported.
Note
Putting the variant within the path may make it difficult to re-create variation URLs of items. If the item’s URL is either prefixed or suffixed with dimension information, the variation URL is easily created.
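To illustrate why a prefixed variant is easy to re-create, here is a minimal sketch. The helper name swap_prefix and the example paths are assumptions made for this illustration and are not part of Django Text Variation.

```python
def swap_prefix(path, old, new):
    """Replace a leading /old/ segment of path with /new/.

    Hypothetical helper: it only handles the prefixed form, which is
    exactly why that form is easy to re-create.
    """
    prefix = "/%s/" % old
    if not path.startswith(prefix):
        raise ValueError("%r does not start with %r" % (path, prefix))
    # Keep everything after the old prefix and put the new variant in front.
    return "/%s/%s" % (new, path[len(prefix):])


spanish_url = swap_prefix("/en/blog/post/", "en", "es")
print(spanish_url)
```

When the variant sits in the middle of the path, as in /blog/en/post/, a helper like this would also need to know which segment holds the variant, and that position depends on the item.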
Section 3.3 of RFC 3986 describes path parameters, which are delimited by a semicolon (;) or a comma (,). One benefit of the semicolon is support within Python’s urlparse module. It is also possible to use a dot (.) delimiter if you wish, although it is less common and not recommended.
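As a small illustration of the semicolon support mentioned above: in Python 3 the urlparse function lives in urllib.parse, and it splits a parameter off the last path segment into its own field. The URL is made up for this example.

```python
from urllib.parse import urlparse

# Hypothetical URL: the audience variant "cd" is carried as a path parameter.
parts = urlparse("http://www.example.com/en/blog/post;cd")

path = parts.path        # the path without the parameter
variant = parts.params   # the text after the semicolon in the last segment
print(path, variant)
```

urlparse only recognizes the semicolon. A comma or dot delimiter would have to be split from the path by hand.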
Domain variations are defined within each variant’s dictionary. Each variant for each dimension has several options, one of which is domain. If domain is not specified, the default domain is assumed.
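The fallback to the default domain can be sketched as follows. Only the domain option is taken from the text above. The setting layout and the names DEFAULT_DOMAIN, LANGUAGE_VARIANTS, domain_for and variant_for_host are placeholders invented for this illustration and should not be read as the library's actual configuration.

```python
# Hypothetical layout: each variant maps to a dictionary of options.
DEFAULT_DOMAIN = "www.example.com"

LANGUAGE_VARIANTS = {
    "en": {},                            # no domain: the default is assumed
    "es": {"domain": "example.es"},      # a different top-level domain
    "fr": {"domain": "fr.example.com"},  # a host-level difference
}


def domain_for(variant, variants=LANGUAGE_VARIANTS, default=DEFAULT_DOMAIN):
    """Return the variant's domain, or the default when none is set."""
    return variants[variant].get("domain", default)


def variant_for_host(host, variants=LANGUAGE_VARIANTS):
    """Return the variant whose domain equals host, or None."""
    for name, options in variants.items():
        if options.get("domain") == host:
            return name
    return None


print(domain_for("en"), domain_for("es"), variant_for_host("fr.example.com"))
```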
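The middleware checks whether the requested path fits one of the regular expressions defined in TEXT_VARIATIONS['URL_REGEXES']. To make the regular expressions easier to define, two shortcuts are available: a dimension name in braces, such as {language}, expands to a named group matching any of that dimension’s variants, and {path} expands to (.*).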
The examples below assume two dimensions, ‘language’ and ‘audience’, with variants ‘en’, ‘es’, ‘es-mx’, ‘fr’ and ‘cd’, ‘tn’, ‘ad’ respectively.
# | RegEx Pattern | Result |
---|---|---|
1 | '{language}/{path};{audience}' | '(?P<language>en|es|es-mx|fr)/(.*);(?P<audience>cd|tn|ad)' |
2 | '{path}/{language}/{path};{audience}' | '(.*)/(?P<language>en|es|es-mx|fr)/(.*);(?P<audience>cd|tn|ad)' |
3 | '{language}/{path}\.{audience}' | '(?P<language>en|es|es-mx|fr)/(.*)\.(?P<audience>cd|tn|ad)' |
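The expansion of these shortcuts can be approximated in a few lines of Python. This is a sketch of the idea and not the middleware's own code: the function name expand_pattern and the DIMENSIONS dictionary are assumptions, and the variants of each dimension are joined with | so that the named group matches any one of them.

```python
import re

# Assumed for this sketch: the two dimensions and their variants from above.
DIMENSIONS = {
    "language": ["en", "es", "es-mx", "fr"],
    "audience": ["cd", "tn", "ad"],
}


def expand_pattern(pattern, dimensions):
    """Expand {path} and {dimension} shortcuts into a regular expression."""
    def replace(match):
        name = match.group(1)
        if name == "path":
            return "(.*)"
        # Variants are joined unescaped; this assumes they contain no
        # characters with special meaning in a regular expression.
        return "(?P<%s>%s)" % (name, "|".join(dimensions[name]))
    return re.sub(r"\{(\w+)\}", replace, pattern)


regex = expand_pattern("{language}/{path};{audience}", DIMENSIONS)
match = re.match(regex, "es-mx/blog/post;tn")
print(regex)
print(match.group("language"), match.group("audience"))
```

Because {path} becomes an unnamed group, a pattern may contain it more than once, as in the second row of the table, without causing a duplicate group name.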
The middleware modifies the request
See TextVariationMiddleware for more detailed information.