Re: Re: [Typo3-UG Oesterreich] Problems with the tt_news module

Tue Dec 30 11:13:10 CET 2003

Tilli, Franz,

At 10:06 30.12.2003, Franz P. Kratochvil wrote:
>Tilli Weissenberger wrote:
>>But why does the RTE change all my <BR>s into </p><p>?
>>E.g.:
>><p><b>Ueberschrift</b>:<br>
>>On one site, in addition to the settings Georg posted, I also have the
>>following in my Setup:
>#no wrapping of RTE lines
>tt_content.text.20.parseFunc {
>nonTypoTagStdWrap.encapsLines.nonWrappedTag >
>nonTypoTagStdWrap.encapsLines.wrapNonWrappedLines = <p>|</p>
>}
>(am I on the right track, Georg?)

I think so, and if the result matches your expectations, then certainly.
One warning, though: once the <P> tags are removed, paragraph formatting
in the RTE may no longer work.

Below are two medium-length write-ups on XHTML and customized parsing;
Sacha's Setup example in particular offers plenty of inspiration for
further experiments on long winter evenings.

hth, best regards, georg, wishing you all a happy New Year

############################################################
# T3    TS      xhtml - HTML validation - configure output parsing
############################################################

Q:
How do I configure the output parsing?

A:
# SETUP:
config.xhtml_cleaning = cached
# all = the content is always processed before it may be stored in cache.
# cached = only if the page is put into the cache,
# output = only the output code just before it's echoed out.

Put this in your setup; it should clean up some of the problems you have.
This is what the TSref says:

xhtml_cleaning
     string
     Tries to clean up the output to make it XHTML compliant and a bit more.
THIS IS NOT COMPLETE YET, but a "pilot" to see if it makes sense anyways.
For now this is what is done:

What it does at this point:
- All tags (img,br,hr) is ended with "/>" - others?
- Lowercase for elements and attributes
- All attributes in quotes
- Add "alt" attribute to img-tags if it's not there already.
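
The four steps listed above can be sketched roughly as follows. This is an
illustration in Python, not the PHP code from t3lib_parsehtml; the names
xhtml_clean, TAG_RE and ATTR_RE are made up for this example, and the
regular expressions assume reasonably tidy input markup. Minimized
attributes are passed through unchanged, since the list does not cover them.

```python
import re

VOID_TAGS = {"img", "br", "hr"}  # the three tags named in the list above

# A whole tag: optional "/", tag name, then everything up to the closing ">"
# (quoted attribute values may themselves contain ">").
TAG_RE = re.compile(r"""<(/?)([A-Za-z][A-Za-z0-9]*)((?:[^>"']|"[^"]*"|'[^']*')*)>""")
# One attribute: name, optionally followed by a quoted or unquoted value.
ATTR_RE = re.compile(r"""([\w:-]+)(?:\s*=\s*(?:"([^"]*)"|'([^']*)'|([^\s"'>]+)))?""")


def _clean_tag(match):
    closing, name, raw_attrs = match.group(1), match.group(2).lower(), match.group(3)
    if closing:
        return "</%s>" % name
    if name in VOID_TAGS:
        raw_attrs = raw_attrs.rstrip().rstrip("/")  # drop an existing "/" before ">"
    attrs = []
    for m in ATTR_RE.finditer(raw_attrs):
        # Exactly one of the three value groups is set, or none for a bare name.
        value = next((g for g in m.group(2, 3, 4) if g is not None), None)
        attrs.append((m.group(1).lower(), value))
    if name == "img" and not any(key == "alt" for key, _ in attrs):
        attrs.append(("alt", ""))  # add an empty alt attribute if it is missing
    parts = [name]
    for key, value in attrs:
        if value is None:
            parts.append(key)  # minimized attribute: left as it is
        else:
            parts.append('%s="%s"' % (key, value.replace('"', "&quot;")))
    return "<" + " ".join(parts) + (" />" if name in VOID_TAGS else ">")


def xhtml_clean(html):
    """Lowercase tags and attribute names, quote values, close img/br/hr, add alt."""
    return TAG_RE.sub(_clean_tag, html)


print(xhtml_clean('<IMG SRC=pic.gif><BR><P Align="center">Hi</P>'))
# <img src="pic.gif" alt="" /><br /><p align="center">Hi</p>
```

Attribute values keep their case; only element and attribute names are
lowercased.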

What it does NOT do (yet) according to XHTML specs.:
- Wellformedness: Nesting is NOT checked
- name/id attribute issue is not observed at this point.
- Certain nesting of elements not allowed. Most interesting, <PRE> cannot
contain img, big,small,sub,sup ...
- Wrapping scripts and style element contents in CDATA - or alternatively
they should have entities converted.
- Setting charsets may put some special requirements on both XML
declaration/ meta-http-equiv. (C.9)
- UTF-8 encoding is in fact expected by XML!!
- stylesheet element and attribute names are NOT converted to lowercase
- ampersands (and entities in general I think) MUST be converted to an
entity reference! (&amp;). This may mean further conversion of non-tag
content before output to page. May be related to the charset issue as a
whole.
- Minimized values not allowed: Must do this: selected="selected"
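
As an illustration of one open item from the list above, the ampersand
conversion could look something like this. It is a sketch in Python, not
part of Typo3; the function name escape_bare_ampersands is made up here,
and the code assumes it is not run over the contents of script or style
elements.

```python
import re

# An "&" that does not already start an entity such as &amp; or &#160;
BARE_AMP_RE = re.compile(r"&(?!#?\w+;)")


def escape_bare_ampersands(text):
    """Replace every bare ampersand with &amp;, leaving existing entities alone."""
    return BARE_AMP_RE.sub("&amp;", text)


print(escape_bare_ampersands('<a href="?id=1&type=2">Tom & Jerry &amp; Co.</a>'))
# <a href="?id=1&amp;type=2">Tom &amp; Jerry &amp; Co.</a>
```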

Please see the class t3lib_parsehtml for details.
You can enable this function by the following values:

all = the content is always processed before it may be stored in cache.
cached = only if the page is put into the cache,
output = only the output code just before it's echoed out.

Gruss / Best regards,
Benjamin Fischer

A:
I've also added to class.tslib_pagegen.php line 429:
type="text/javascript"
and also to lines 1589, 1599 in class.tslib_fe.php:
language="javascript" type="text/javascript"
and now the validator says my pages are valid html :)

 > What it does at this point:
 > - All tags (img,br,hr) is ended with "/>" - others?

perhaps " />" would be better than "/>", to enhance backwards compatibility?
Simon Child

I uploaded a modified parsehtml.php which makes the above-mentioned changes.

You can download it here:
http://www.dog-sharing.de/class.t3lib_parsehtml.php.txt

The result looks like this:
http://validator.w3.org/check?uri=http%3A//luerken.unlimitedvision.de/

Hi Ben,

 > Did you change a lot in that file?

No, I just extended the functions that were already there a bit. At the
time I modified it, I was just starting out with Typo3, so it's also not
very clean. For example, this line:
// Set type attribute for all script-tags
if (!strcmp($tagName, "script") && !isset($tagAttrib[0]["type"])) {
    $tagAttrib[0]["type"] = "text/javascript";
}

will add text/javascript to every script tag with a missing type attribute.
But it won't notice if you use, e.g., VBScript.
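
One way around that limitation would be to look at the language attribute
before falling back to JavaScript. The following is only a sketch in Python
of that idea, not the code from the modified file; default_script_type and
SCRIPT_TYPES are names invented for this example, and the attributes are
assumed to arrive as a plain dict with lowercase keys.

```python
# Assumed mapping from the old "language" attribute to a MIME type.
SCRIPT_TYPES = {
    "javascript": "text/javascript",
    "vbscript": "text/vbscript",
}


def default_script_type(tag_name, attrs):
    """Return attrs with a type attribute added for script tags that lack one."""
    if tag_name.lower() != "script" or "type" in attrs:
        return attrs
    language = attrs.get("language", "javascript").lower()
    base = language.rstrip("0123456789.")  # "javascript1.2" -> "javascript"
    result = dict(attrs)
    # Unknown languages still fall back to JavaScript, as in the line above.
    result["type"] = SCRIPT_TYPES.get(base, "text/javascript")
    return result


print(default_script_type("script", {"language": "VBScript"}))
# {'language': 'VBScript', 'type': 'text/vbscript'}
```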

 > I love compliant code. So the combination with config.xhtml_cleaning = all and
 > your mod should give good results?

Yes, my site passes the validator, as you can see in the result the second
link points to. Of course there are still a lot of things you'll have to
take care of. For example, if you add a link to a header in the backend,
the result will be <a><h1>header</h1></a>, and there is the <p><ul></p>
stuff etc. But it is a first step and maybe it fits your needs.
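
Nesting problems like these two can be spotted with a small checker. The
sketch below is Python using the standard html.parser module, not anything
from Typo3; find_bad_nesting, BLOCK_TAGS and NO_BLOCK_INSIDE are names
chosen for this example, and the tag lists are deliberately short rather
than a full copy of the XHTML content model.

```python
from html.parser import HTMLParser

BLOCK_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6", "p", "ul", "ol", "div", "table", "pre"}
NO_BLOCK_INSIDE = {"a", "p"}  # elements that must not contain block-level tags
VOID_TAGS = {"br", "hr", "img"}


class NestingChecker(HTMLParser):
    def __init__(self):
        super().__init__()
        self.stack = []     # currently open elements, outermost first
        self.problems = []  # (parent, child) pairs that are not allowed

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK_TAGS:
            for parent in self.stack:
                if parent in NO_BLOCK_INSIDE:
                    self.problems.append((parent, tag))
                    break
        if tag not in VOID_TAGS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag in self.stack:
            # Close everything up to and including the matching open tag.
            while self.stack.pop() != tag:
                pass


def find_bad_nesting(html):
    checker = NestingChecker()
    checker.feed(html)
    checker.close()
    return checker.problems


print(find_bad_nesting("<a><h1>header</h1></a>"))
# [('a', 'h1')]
```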

Here's the additional TS code I use to clean up the output:

CONSTANTS:
###############################################
#enable/disable RTE conversion
content.RTE_compliant = 1
###############################################

SETUP:
###############################################
#code cleanup section
###############################################
###############################################
#generate xhtml-code
config.xhtml_cleaning = all
###############################################

###############################################
#disable target attribute in the sitemap content element
tt_content.menu.20.1.1.target =
###############################################

###############################################
#remove CSS-attributes for p and pre-tags
tt_content.text.20.parseFunc.nonTypoTagStdWrap.encapsLines.addAttributes {
   P.style=
   PRE.style=;
}
###############################################

###############################################
#no wrapping of RTE lines
tt_content.text.20.parseFunc {
  nonTypoTagStdWrap.encapsLines.nonWrappedTag >
  nonTypoTagStdWrap.encapsLines.wrapNonWrappedLines = <p>|</p>
}
###############################################

###############################################
#remove <br> after <h1>
lib.stdheader.10.stdWrap.wrap =
###############################################

###############################################
#Remove <p> tags around <li> items (try to ;-))
tt_content.bullets.20.3.split.1.wrap =
###############################################

###############################################
tt_content.menu.20.1.1.wrap = <p> | </p>
tt_content.menu.20.1.1.NO.allWrap = |  <br />
tt_content.menu.20.1.1.NO.ATagBeforeWrap = 0
tt_content.uploads.20.default.split.1.filelink.file.postCObject.wrap = <br />
tt_content.menu.20.1.1.NO.linkWrap = |
###############################################

###############################
# Settings for lists from the RTE
tt_content.text.20.parseFunc.tags.typolist.default.wrap = <ul class="list01"> | </ul>
tt_content.text.20.parseFunc.tags.typolist.default.split.1.wrap = <li > | </li>
tt_content.text.20.parseFunc.tags.typolist.1.fontTag = <ol class="list01"> | </ol>
tt_content.text.20.parseFunc.tags.typolist.1.split.1.wrap = <li > | </li>
###############################################

-- 
Ciao,

Sacha