[TYPO3-core] RFC #15457: Add support for prepared queries

Sun Aug 15 09:54:31 CEST 2010

Hi Steffen,

> fantastic, i love to see that!

Thx.

> Here are some annotations from reading:
>
> is there a reason why you used 5 and 10 for BOOL and AUTOTYPE?

5 is the internal value when using PDO, AUTOTYPE=10 is arbitrary (it 
does not exist in PDO) to have "space for other types". But as it's more 
or less "internal" and should only be accessed through constants, we may 
have arbitrary consecutive values as well.

> i don't like the naming of stmt_execute (too cryptic abbreviation). Why
> not preparedQuery_execute, or like the others exec_PREPAREDquery

Taken from MySQLi (http://ch.php.net/manual/en/mysqli-stmt.execute.php) 
but exec_PREPAREDquery is OK I guess.

> * $key = is_numeric($parameter) ? intval($parameter) - 1 : $parameter;
> why -1 ? why not use (string) $parameter always?

string is for named parameters, integer is for indexed-based parameters, 
thus for '?'. This is the same in PDO.

"-1" because when using ->bindValue(), you have to give a 1-base index 
as first argument. This comes from PDO and I'd like to keep it like that 
for possible forward compatibility if we are able to use PDO in the 
future. But as internally arrays are 0-based, I make the conversion at 
setter time.

> why choosing "thaw"? I looked up the translation but can't associate
> what is meant.

:-) that was one of the point where I did not know which verb to choose. 
I took the naming from Extbase code where the data mapper "thaws" the 
property values onto an object. I understood it as some synonym of 
"merge" if you look a translation.

> what happen ig thawValuesInQuery is called without PARAM_AUTOTYPE and
> wrong type?

Don't understand what you mean. The test for PARAM_AUTOTYPE is needed 
because of bindParam() where the type as well as the value of the 
parameter is determined at execution time, not at binding time. Will add 
a comment for that in code. This method is internal anyway meaning the 
"world" cannot use it without relying on the official API which makes 
sure all is properly defined.

But you're right, I'll add constraints as InvalidArgumentException if 
the values do not match authorized values (unknown type or obviously 
wrong information with a string defined as INTEGER).

> $stmt - see above, better naming

OK.

> if ($parameterReferences == NULL) {
> use strict here

OK.

> protected function &getValueOrParameter(...
> why is the reference here? Could you add this to the description?

You need the & in order to have PHP return a reference to the value it 
created inside, otherwise you only get its value (but have it as 
reference inside parameterReferences) and you cannot rely on putting the 
real value inside parameterReferences to have it updated in all parts of 
the where clause where it appears.

> I also don't know the parser very well, so can you explain how
> parameterReferences are filled? Maybe related to last mentioned function?

As explained briefly before, the point of having this global array is to 
have a quick access to all parameters used in query.

To make it brief while explaining how the SQL parser works, let's take a 
simple example:

SELECT uid, pages.title FROM pages WHERE pid = :pid

The SQL parser (t3lib_sqlparser) will parse this into a structure 
similar to this one:

$components = array(
	'type'   => 'SELECT'
	'SELECT' => array(
		array(
			// ...
			'field' => 'uid',
			// ...
		),
		array(
			// ...
			'table' => 'pages',
			'field' => 'title',
			// ...
		),
	),
	'FROM' => array(
		// ...
		'pages',
		// ...
	),
	'WHERE' => array(
		array(
			// ...
			'pid' => array(
				array(
					0 => ':pid',
					1 => '',
				),
			),
			// ...
		),
	),
);

I did not dump a real components array, but this is basically how it 
looks like.

When you compile it again, SQL parser generates the query back, as a string.

Now the point is that it would be dumb to loose the information of where 
the parameters where parsed and it would be counter-productive to take 
the whole component tree and dig into it at runtime to replace 
placeholders with real values to be used for the current execution. This 
is why I added a "parameters" structure at level 0 of the components 
array to reference all parameters. Previous query looks more like that 
actually:

	'WHERE' => array(
		array(
			// ...
			'pid' => array(
				---> pointer to parameters[':pid']
			),
			// ...
		),
	),
	'parameters' => array(
		':pid' => array(
			0 => ':pid',
			1 => '',
		),
	),

This way, I'm able to replace all :pid placeholders which a single 
statement:

$components['parameters'][':pid'][0] = 52;

and when compiled again, query will be:

SELECT uid, pages.title FROM pages WHERE pid = 52

When using DBAL, I use an intermediate structure to prevent having to 
compile the whole query again and again at each execution but still have 
a way of properly know where the parameters are. For instance without 
the parser, how could you know that this query only contains one 
question mark parameter?

SELECT * FROM pages WHERE pid = ? AND title = 'How to?'

> For testing i would like to see unit tests using the prepared
> statements, would make it much easier ;)

Some unit tests are available but I wrote them in DBAL package as I 
already did before. Have a look at DBAL patch in BT. I wrote all my unit 
tests in DBAL as of today because SQL parser was only used when using 
DBAL and the problems did not appear without it. As prepared queries 
will start being used in Core after this RFC, I think it would be wise 
to have prepared query-related unit tests in Core, as usual.

> Thanks for your great work and sry for the nitpicking.

NP. I like to get feedback because it allows improving what gets "in" :-)

I'll wait a bit for other comments before proposing another version with 
your "nitpicking" taken into account.

Xavier