[TYPO3-dev] An idea to further process ' page not found ' 404 handling
SwagmanInternet
typo3 at swagmaninternet.com
Tue Apr 29 06:24:03 CEST 2008
Hi all,
Background.
On and off over the last 2-3 weeks I have been working to get my TYPO3
installs to process 'page not found' (404) requests properly. I have read
the information on this page
http://typo3.org/development/articles/improved-404-handling/ and note that
requests for pages and resources that do not exist are redirected to the
website's home page and do not return correct 404 HTTP headers.
So I have come up with the comments and suggestions below from my research
and testing. In case I had missed something when setting up 404 handling on
my own T3 site(s), I also tried the example incorrect URLs listed below
on a few random TYPO3 websites online and saw similar redirections to the
index page of those TYPO3 installs.
If the code suggestions below are plausible, perhaps they could be considered
for the core?
Your thoughts and comments are appreciated.
Regards,
Matt
SwagmanInternet.co
--------------------------------------------------------------------
404 handling report & code suggestions in detail:
After following the setup guide at
http://typo3.org/development/articles/improved-404-handling/, using the
default TYPO3 .htaccess file, with or without 'simulate static documents',
it appears that pageNotFound_handling works only for requested files:
- that have .html as the file suffix
- and that are requested from the root of a
TYPO3 site, i.e. www.domain.com.au/file.html
If 'file.html' exists, the page is shown.
If 'wrongfile.html' does not exist, 404 handling takes place
correctly, thanks to function '$this->checkAndSetAlias()'.
Correct 404 page error handling appears to fail for the following requested
resources:
- www.domain.com.au/file.htm
- www.domain.com.au/file.pdf
- www.domain.com.au/folder/
- www.domain.com.au/folder/file.html
- www.domain.com.au/folder/file.pdf
- www.domain.com.au/file.htm?&tx_yourextension_pi1[showUid]=171
- www.domain.com.au/folder/file.html?&tx_yourextension_pi1[showUid]=171
If any of the above requested resources does not exist, the browser is
redirected to the home page of the website (the root page), because $this->id
is 'false' and is then set to '0' in function 'setIDfromArgV()'.
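To make the fallback described above concrete, here is a minimal sketch in PHP. It is a simplified model of the behaviour reported in this post, not the actual TYPO3 code; the function name resolvePageId() and the root page uid of 1 are assumptions made up for illustration.

```php
<?php
// Hypothetical, simplified model of the fallback described above.
// This is NOT the real TYPO3 code; all names are made up for illustration.

/**
 * Returns the page uid the frontend would end up rendering.
 *
 * @param string|null $requestedId Value of the "id" GET parameter, null if absent
 * @param int         $rootPageId  Uid of the site's root (home) page (assumed value)
 * @return int
 */
function resolvePageId($requestedId, $rootPageId)
{
    // No id in the request: treated as false, then normalised to 0.
    $id = ($requestedId === null || $requestedId === '') ? 0 : $requestedId;

    // An id of 0 carries no "not found" reason, so the root page is shown.
    if ($id === 0) {
        return $rootPageId;
    }
    return (int)$id;
}

// /folder/file.pdf rewritten to plain index.php: no id, so the home page is served.
var_dump(resolvePageId(null, 1));
// index.php?id=42: the requested page is used.
var_dump(resolvePageId('42', 1));
```

The point of the sketch is that a request for a missing resource and a request for the home page look identical once the id has been normalised to 0, so no 404 is ever raised.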
--------------------------------------------------------------------
Suggested short-term workaround, which appears to work until a potential patch
to 'class.tslib_fe.php' is available.
In .htaccess, modify to suit:
AFTER lines
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-l
COMMENT OUT
# RewriteRule .* index.php [L]
ADD/CHANGE TO
# If any file/dir/symlink does not exist, set id=000; this value is
# not likely to exist in the database's pages table
RewriteRule .* index.php?id=000 [L]
------------------------
What happens in 'class.tslib_fe.php' (after the above .htaccess hack is
applied) when TYPO3 processes the requested page and 'id=000' is forced in
the .htaccess file:
- firstly, the argument '000' arrives as type 'string'
- when $this->id is then processed by function
checkAndSetAlias(), $this->pageNotFound is set to 4:
t3lib_div::testInt('000') is false, so the condition " if
($this->id && !t3lib_div::testInt($this->id)) " is true when id=000,
the id is looked up as a page alias, and no such alias exists
- now that the var $this->pageNotFound has a value, the following
function is called
$this->pageNotFoundAndExit($pNotFoundMsg[$this->pageNotFound])
which in turn executes the function
$this->pageNotFoundHandler() and causes the script to exit
- Note: if the following file is requested,
'www.domain.com.au/file.html',
$this->id is changed from 000 to the alias value in variable
$fI['file'] in function checkAlternativeIdMethods();
therefore aliases formed by simulate static documents are still
checked for existence in the database
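The walk-through above can be sketched in PHP as follows. The functions are simplified stand-ins, not copies of the TYPO3 source: testIntSketch() assumes that t3lib_div::testInt() accepts a value only if it equals its own integer conversion when compared as a string, and lookupAlias() with its alias table is purely hypothetical.

```php
<?php
// Simplified stand-ins for the checks discussed above. The real methods
// live in t3lib_div and tslib_fe; these versions are assumptions written
// only to show how the string '000' is treated.

// Assumed behaviour of t3lib_div::testInt(): true only if the value
// equals its own integer conversion when both are compared as strings.
function testIntSketch($var)
{
    return strcmp((string)$var, (string)intval($var)) === 0;
}

// Hypothetical alias lookup: returns a page uid, or 0 if no page has the alias.
function lookupAlias($alias, array $aliases)
{
    return isset($aliases[$alias]) ? $aliases[$alias] : 0;
}

// Mirrors the condition quoted above and returns the pageNotFound code
// (4 = alias does not exist, 0 = nothing flagged).
function aliasCheck($id, array $aliases)
{
    if ($id && !testIntSketch($id)) {
        return lookupAlias($id, $aliases) ? 0 : 4;
    }
    return 0;
}

$aliases = array('file' => 12); // made-up alias table

var_dump(aliasCheck('000', $aliases));       // int(4): non-empty, not an integer string, no such alias
var_dump(aliasCheck('file', $aliases));      // int(0): alias exists
var_dump(aliasCheck('wrongfile', $aliases)); // int(4): alias missing
var_dump(aliasCheck('0', $aliases));         // int(0): '0' is falsy, check is skipped
```

Under these assumptions, the value '000' behaves like an alias that can never match, which is what lets the workaround trigger the existing pageNotFound code 4.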
Testing scenarios for the short-term workaround:
- all of the requested resource URLs above (both working and failing
URLs) were tested in the browser with the short-term fix in the .htaccess file
- tested using TYPO3 src v4.1.5 without simulate static
documents, i.e. www.domain.com.au/index.php?id=000
- tested using TYPO3 src v4.1.5 with simulate static
documents, config.simulateStaticDocuments = 1
- not tested with the realurl extension installed
- this workaround works on TYPO3 src v4.1.5 and will potentially
work on previous versions of the TYPO3 source, though this depends on what
changes, if any, exist between versions of 'class.tslib_fe.php'
-----------------------------------------------------------
Long term suggested solution for fixing 404 pagenotfound_handling:
After reviewing code mainly in file 'class.tslib_fe.php',
I suggest the following 2 code fixes/modifications in file
'class.tslib_fe.php'.
------------------------
NEW function call added inside function fetch_the_id(), and also a 5th
key => value pair added to array $pNotFoundMsg
// Checks if $this->id is false && pageNotFound_handling is enabled;
// if yes, sets $this->pageNotFound = 5
$this->checkAndSetPageNotFound();
if ($this->pageNotFound && $this->TYPO3_CONF_VARS['FE']['pageNotFound_handling']) {
    $pNotFoundMsg = array(
        1 => 'ID was not an accessible page',
        2 => 'Subsection was found and not accessible',
        3 => 'ID was outside the domain',
        4 => 'The requested page alias does not exist',
        5 => 'The requested page or file resource does not exist'
    );
-----------------------------------------------------------
ALSO NEW function, possibly inserted below function
ADMCMD_preview_postInit($previewConfig)
/**
 * Checks if $this->id is false and pageNotFound_handling is enabled;
 * if yes, sets $this->pageNotFound = 5.
 * When $this->pageNotFound is set to 5, TYPO3 redirects pageNotFound
 * requests to the value set in
 * $TYPO3_CONF_VARS['FE']['pageNotFound_handling'].
 * This should only run when a file/symlink/directory does not exist and
 * the request was rewritten to index.php in the .htaccess file.
 *
 * @return void
 * @access private // should this be set private?
 * @see fetch_the_id()
 */
function checkAndSetPageNotFound() {
    // Written as "!= 4": "!$this->pageNotFound == 4" would be parsed
    // as "(!$this->pageNotFound) == 4".
    if (!$this->id
        && $this->TYPO3_CONF_VARS['FE']['pageNotFound_handling']
        && $this->pageNotFound != 4) {
        $this->pageNotFound = 5;
    }
}
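As a rough illustration of how the proposed function behaves, the sketch below runs it on a stub object. FrontendStub and runCheck() are hypothetical helpers, not part of TYPO3, and the config value true is only a placeholder for any non-empty pageNotFound_handling setting. The last loop compares the two spellings of the guard on $this->pageNotFound, since "!" binds more tightly than "==" in PHP.

```php
<?php
// Minimal stand-in for the frontend object; this is not the real tslib_fe class.
class FrontendStub
{
    public $id = false;
    public $pageNotFound = 0;
    // Any non-empty value enables the handling; true is just a placeholder.
    public $TYPO3_CONF_VARS = array('FE' => array('pageNotFound_handling' => true));

    public function checkAndSetPageNotFound()
    {
        if (!$this->id
            && $this->TYPO3_CONF_VARS['FE']['pageNotFound_handling']
            && $this->pageNotFound != 4) {
            $this->pageNotFound = 5;
        }
    }
}

// Demo helper: run the check for a given starting state and return the code.
function runCheck($id, $pageNotFound)
{
    $fe = new FrontendStub();
    $fe->id = $id;
    $fe->pageNotFound = $pageNotFound;
    $fe->checkAndSetPageNotFound();
    return $fe->pageNotFound;
}

var_dump(runCheck(false, 0)); // int(5): no id and nothing flagged yet
var_dump(runCheck(false, 4)); // int(4): a missing alias keeps its own code
var_dump(runCheck(12, 0));    // int(0): a valid id is left alone

// Precedence check: "!$value == 4" is evaluated as "(!$value) == 4".
foreach (array(0, 1, 4) as $value) {
    $asWritten = (!$value == 4);
    $explicit  = ($value != 4);
    echo $value, ': ', var_export($asWritten, true), ' / ', var_export($explicit, true), "\n";
}
// prints:
// 0: true / true
// 1: false / true
// 4: false / false
```

The two spellings only disagree when pageNotFound already holds 1, 2 or 3, so which one is preferable depends on whether those existing reasons should be overwritten by 5.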
-----------------------------------------------------------
Note:
Don't forget to set the HTML tag '<base href=...>' in your main TS template
config. This matters when a page is requested with a folder in its path and
the page does not exist, so that the browser is redirected to your
TYPO3 index page. If your pages do not have the '<base href=...>' tag or
absolute paths set, the relative paths of page resources (i.e. CSS files &
images) can break and cause additional browser redirects to your TYPO3
site's index page, overall potentially causing an infinite loop.
When you set 'config.baseURL' you should also set 'prefixLocalAnchors';
this stops href anchors '#' from looking like www.domain.com.au/# and
instead prefixes the anchor with the 'page name' of the currently requested
page. The TS code below also demonstrates how to set 'config.baseURL' to
work with https pages, i.e. when using the extension 'https_enforcer'.
Code to add to the main TS template (inserted above the 'page = PAGE'
declaration):
------------------
# turn on simulate static documents
config.simulateStaticDocuments = 1
## Set <base href=...>; takes into account whether the website uses SSL/https pages
## remove the single comment markers below if the website uses SSL/https pages
#[globalVar = TSFE:page|tx_httpsenforcer_force_secure = 1]
#config.baseURL = https://www.domain.com.au/
#[else]
config.baseURL = http://www.domain.com.au/
#[global]
config.prefixLocalAnchors = all
page = PAGE
-----------------------------------------------------------
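To show why the base href matters, here is a small PHP sketch of how a browser would resolve a relative resource path with and without it. The helper resolveRelative() and the stylesheet path fileadmin/css/site.css are assumptions for illustration only, and the helper handles plain relative paths only (no '../', query strings or fragments).

```php
<?php
// Hypothetical helper: resolves a plain relative path against the URL of
// the document (or the <base href>). Deliberately simplified; it does not
// handle "../", query strings or fragments.
function resolveRelative($base, $relative)
{
    $parts = parse_url($base);
    $path  = isset($parts['path']) ? $parts['path'] : '/';
    // Keep everything up to and including the last slash (the "directory").
    $dir = substr($path, 0, strrpos($path, '/') + 1);
    return $parts['scheme'] . '://' . $parts['host'] . $dir . $relative;
}

$css = 'fileadmin/css/site.css'; // made-up relative stylesheet path

// Without <base href>: resolved against the requested (non-existent) page,
// so the stylesheet request itself hits a missing path under /folder/.
echo resolveRelative('http://www.domain.com.au/folder/file.html', $css), "\n";
// prints: http://www.domain.com.au/folder/fileadmin/css/site.css

// With config.baseURL = http://www.domain.com.au/ the path resolves from the site root.
echo resolveRelative('http://www.domain.com.au/', $css), "\n";
// prints: http://www.domain.com.au/fileadmin/css/site.css
```

In the first case every relative resource on the error page would itself be a missing file and be rewritten to index.php again, which is the chain of extra redirects the note above warns about.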