[TYPO3-doc] DocBook: XML IDs + translations/version handling and file structure impact

François Suter fsu-lists at cobweb.ch
Thu Jun 16 17:05:26 CEST 2011


Hi all,

I had a very interesting and fruitful 1.5-hour chat with Tom
yesterday evening. It helped me express out loud some of the concepts I
was stumbling over when trying to look at the larger picture. It all
started with the management of id attributes, but it was really related
to wider concerns.

> I would recommend to keep the translated books each in their own domains. In
> my opinion there is no advantage in incorporating translation into an English
> book. Or in other words: don't mix.

That is indeed the conclusion we came to. In the original file structure 
that I had proposed, translations were at the same "level" as the 
original language. As a reminder:

├── FLOW3
│   ├── Core
│   └── Packages
├── TYPO3v4
│   ├── Books
│   │   └── ExtbaseFluid
│   ├── CoreManuals
│   │   ├── doc_core_api
│   │   │   ├── de
│   │   │   ├── en
│   │   │   │   ├── chapter1.xml
│   │   │   │   ├── chapter2.xml
│   │   │   │   ├── images
│   │   │   │   ├── index.xml
│   │   │   │   └── preface.xml
│   │   │   ├── fr
│   │   │   └── ru

This won't do because the same id attributes would appear several times
within the file structure we work on, since ids don't change across
languages.
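
To make the problem concrete, here is a minimal sketch of a check that
lists id values declared in more than one file below a given folder. It
is only an illustration: the function names are mine, and I assume the
files are well-formed XML that use either a plain "id" attribute or
"xml:id".

```python
import xml.etree.ElementTree as ET
from collections import defaultdict
from pathlib import Path

# Namespace behind the "xml:" prefix, needed to read xml:id attributes.
XML_NS = "http://www.w3.org/XML/1998/namespace"


def collect_ids(root_dir):
    """Map each id value to the files below root_dir that declare it."""
    root_dir = Path(root_dir)
    seen = defaultdict(list)
    for path in sorted(root_dir.rglob("*.xml")):
        for element in ET.parse(str(path)).iter():
            # Assumption: ids are stored as xml:id or as a plain id attribute.
            value = element.get("{%s}id" % XML_NS) or element.get("id")
            if value:
                seen[value].append(path.relative_to(root_dir).as_posix())
    return seen


def duplicate_ids(root_dir):
    """Return only the id values that are declared in more than one place."""
    return {
        value: files
        for value, files in collect_ids(root_dir).items()
        if len(files) > 1
    }
```

Run against a manual folder that contains de, en, fr and ru subfolders,
this would report practically every id. Run against a single language
folder, it should report nothing.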

The logical conclusion we came to is that the file structure will be 
separate for each language (although similar, of course). Hence 
something like:

├── de
├── en
│   ├── FLOW3
│   │   ├── Core
│   │   └── Packages
│   ├── TYPO3v4
│   │   ├── Books
│   │   │   └── ExtbaseFluid
│   │   ├── CoreManuals
│   │   │   ├── doc_core_api
│   │   │   │   ├── chapter1.xml
│   │   │   │   ├── chapter2.xml
│   │   │   │   ├── images
│   │   │   │   ├── index.xml
│   │   │   │   └── preface.xml
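
The change between the two layouts boils down to moving the language
folder to the top of the path. A small sketch of that mapping, assuming
the four language codes from the first tree (the function name and the
LANGUAGES set are just for illustration):

```python
from pathlib import PurePosixPath

# Assumption: the language codes shown in the original file structure.
LANGUAGES = {"de", "en", "fr", "ru"}


def to_language_first(old_path):
    """Move the first language folder found in old_path to the front."""
    parts = PurePosixPath(old_path).parts
    for index, part in enumerate(parts):
        if part in LANGUAGES:
            rest = parts[:index] + parts[index + 1:]
            return str(PurePosixPath(part, *rest))
    raise ValueError("no language folder found in %r" % old_path)
```

For example, "TYPO3v4/CoreManuals/doc_core_api/en/chapter1.xml" would
become "en/TYPO3v4/CoreManuals/doc_core_api/chapter1.xml".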

> Just to be clear: with the term "versions" you mean a new document which
> describes [a] software from version X to version X+1, right?
> In most cases, this is a sequential process: write for version X, release it,
> and create a tag in your favorite version control system. After it is
> released, start writing for the new version X+1 and the process is repeated.

Versioning (in the Version Control System sense) provided far more food 
for discussion. My initial worry was that if you have several versions 
of one document in the same file structure, there might be confusion 
when trying to cross-link. This triggered a general discussion about 
managing versions, in particular in the light of using Git (which we 
will probably do, since the community is largely using it/switching to it).

First of all, it must be noted that the version of the manual itself is
a very well-defined concept in DocBook. There's a "productnumber" tag
that can be used as part of the introductory "info" block. This is the
version number that will be used, upon rendering, to tell one version
of a manual from another, independently of whatever version numbering
takes place in the VCS (i.e. Git).
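
As an illustration, here is a rough sketch of how a rendering script
might read that number. The sample document and its version "4.5" are
made up, the function names are mine, and I assume the tag sits inside
an "info" block (or "bookinfo"/"articleinfo" in older DocBook versions).

```python
import xml.etree.ElementTree as ET

# Assumption: names of the introductory block in DocBook 5 and DocBook 4.
INFO_BLOCKS = {"info", "bookinfo", "articleinfo"}

# Made-up sample document, not taken from any real manual.
SAMPLE = """<book xmlns="http://docbook.org/ns/docbook" version="5.0">
  <info>
    <title>Core API</title>
    <productnumber>4.5</productnumber>
  </info>
</book>"""


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def manual_version(xml_text):
    """Return the first productnumber found in an info block, or None."""
    root = ET.fromstring(xml_text)
    for block in root.iter():
        if local_name(block.tag) not in INFO_BLOCKS:
            continue
        for element in block.iter():
            if local_name(element.tag) == "productnumber" and element.text:
                return element.text.strip()
    return None
```

With the sample above, manual_version(SAMPLE) gives "4.5". A manual
without the tag gives None, which the rendering process would have to
handle somehow.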

The structure of the documentation repository raised some questions. The
current structure [1] has one folder per manual, then branches and trunk
below that. This allows each manual to evolve independently. This won't
be possible with Git, as branching is done at the repository level and
not at the folder level. Thinking about this led to the following
remarks and conclusions:

- all the branches that we currently have are not that useful, because 
we never actually go back and fix mistakes in older versions of the 
manuals. We really use branches more like tags. This means we could 
probably live with the following scheme:

1) the master branch would represent the current and stable version of 
any given manual. A "new version" branch would exist for preparing a new 
version of a manual. When it's considered ready for release, the changes 
are merged into the master branch. Released versions are tagged, just in 
case we need to go back to that particular version.

2) since manuals evolve independently we need to have one Git repository
for each manual. Each would have its master branch and its "new version"
(editing) branch. The main documentation Git repository would include
all other Git repositories as submodules so that it's possible to check
out the complete file structure. It would be kept in sync with the
master branch of every submodule.

3) the main repository would be used to get all the files for rendering,
at their latest stable version. Rendering could occur roughly once per
day, at least to start with (at a later point we could imagine having
some way to trigger a rendering on demand). The version number of each
manual would be taken from the DocBook tag mentioned above. If that
version doesn't exist yet, a new folder would be created in the
rendering structure, so that older versions of manuals remain available.
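
The folder handling described in point 3 could look something like the
sketch below. The "manual/version" layout of the rendering structure
and the function name are assumptions for the sake of the example,
nothing has been decided yet.

```python
from pathlib import Path


def rendering_target(output_root, manual, version):
    """Return (folder, created) for rendering `manual` at `version`.

    Assumed layout: <output_root>/<manual>/<version>. Folders of older
    versions are never touched, so they stay available after a new
    version is rendered.
    """
    target = Path(output_root) / manual / version
    created = not target.exists()
    target.mkdir(parents=True, exist_ok=True)
    return target, created
```

The daily rendering job would call this once per manual with the
version number read from the DocBook file. The "created" flag tells it
whether a new version has just appeared.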

> I'm not 100% sure if I understood you correctly. What impact do you expect
> it to have on the editors?
> If you've separated each language as I've suggested before, there won't be any
> overlap. So there won't be any duplicate IDs.

That was mostly my worry, but it is addressed by the solutions
outlined above.

> The rendering process, however, does not know of any other languages. If you
> translate an English book, it knows only of that book. This may or may not
> be what you want.

This is fine. I suppose that the rendering process will also create
entries in a database, so that we have a simple way of finding all
possible manuals, their versions, their translations, etc. I have a few
ideas about this, but I'm not going to write them down here to avoid
derailing this thread ;-) The only implication for rendering is that we
thought about dropping the notion of sets, as we would probably not use
those "superstructure" indices anyway.

I hope I have managed to explain that as clearly as it is in my head now 
:-) Feedback welcome, as usual.

-- 

Francois Suter
Cobweb Development Sarl - http://www.cobweb.ch

[1] https://svn.typo3.org/TYPO3v4/Documentation/

