XE3 RTL Changes: A closer look at TStringHelper

For those adventurous types who like to get down and dirty spelunking in the RTL source, you have undoubtedly noticed that there have been quite a few changes made to the runtime code. If you aren't the type to immediately diff the latest source code to the last release to see all the new "goodies", I'll try to provide a quick highlight of some of the more significant changes you might find.

One of the most notable changes in the RTL is the addition of record helpers for simple types. Rodrigo Ruz did a nice job covering them in a recent blog post which you may want to read. There are several useful things that record helpers provide:

  • Code completion when typing a dot after a variable name

  • Encapsulation of common operations that can be performed on the type

  • More concise and intuitive naming for operations

TStringHelper is definitely the most significant of all the new record helpers. It's also unique in that it offers another benefit in addition to the ones mentioned above: helping you to prepare for a brave new world that is coming.

If you look at the declaration of the helper in System.SysUtils.pas you'll see that it has dozens of methods and properties. One of these, whose usefulness may not be immediately obvious, is the "Chars" property. If you read the post from Rodrigo, you'll know that, like everything else on the string helper, this one is zero based. This is going to be important when writing code which needs to work regardless of the current state of the compiler directive which controls zero based strings. This property, unlike the more common "string[index]" syntax, always reads into the string data starting with a zero based index. Here is a quick code snippet to demonstrate the difference:

[Figure: Zero based string sample]

As you can see, the code using the Chars property doesn't change, whereas the "normal" string indexing does.
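For readers without the screenshot, here is a minimal sketch of that kind of comparison. It is not the original sample: the program and procedure names are made up for illustration, and it assumes the directive in question is {$ZEROBASEDSTRINGS ON|OFF} applied locally around each procedure.

```delphi
program ZeroBasedSample; // hypothetical name, for illustration only

{$APPTYPE CONSOLE}

uses
  System.SysUtils; // brings TStringHelper (and its Chars property) into scope

{$ZEROBASEDSTRINGS OFF}
procedure OneBasedIndexing;
var
  S: string;
begin
  S := 'Hello';
  Writeln(S[1]);       // first character: brackets are 1-based here
  Writeln(S.Chars[0]); // first character: Chars is always 0-based
end;

{$ZEROBASEDSTRINGS ON}
procedure ZeroBasedIndexing;
var
  S: string;
begin
  S := 'Hello';
  Writeln(S[0]);       // first character: brackets are now 0-based
  Writeln(S.Chars[0]); // first character: same call as before
end;
{$ZEROBASEDSTRINGS OFF}

begin
  OneBasedIndexing;
  ZeroBasedIndexing;
end.
```

Only the bracket index had to change between the two procedures; the Chars call is identical in both.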

The XE3 RTL source code has been refactored to be string index base agnostic. In most cases this is done by utilizing string helper functions which are always zero based. When it is necessary to traverse a string, the Chars[] property is often used to access the individual characters without concern for the current state of the compiler with respect to zero based strings. In addition, the "Low" and "High" standard functions can now be passed a string variable to provide further flexibility as needed. When zero based strings are enabled, Low(string) will return 0; otherwise it will return 1. Likewise, High(string) returns the index of the last character: Length - 1 when zero based strings are enabled, Length otherwise. One important "gotcha" to watch out for here is that you don't want to use Low() and High() together with the Chars[] property, since the values they return change with the directive and what Chars[] expects does not.
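The two safe pairings can be sketched roughly like this. The function names are hypothetical and this is not taken from the RTL source; it only illustrates keeping Low()/High() with bracket indexing and a fixed 0..Length - 1 range with Chars[].

```delphi
program TraverseSample; // hypothetical name, for illustration only

{$APPTYPE CONSOLE}

uses
  System.SysUtils;

// Bracket indexing paired with Low/High: both follow the directive,
// so the loop is correct whether zero based strings are ON or OFF.
function CountSpacesIndexed(const S: string): Integer;
var
  I: Integer;
begin
  Result := 0;
  for I := Low(S) to High(S) do
    if S[I] = ' ' then
      Inc(Result);
end;

// Chars paired with a fixed 0..Length-1 range: neither follows the
// directive, so this loop is also correct in both modes.
function CountSpacesChars(const S: string): Integer;
var
  I: Integer;
begin
  Result := 0;
  for I := 0 to S.Length - 1 do
    if S.Chars[I] = ' ' then
      Inc(Result);
end;

begin
  Writeln(CountSpacesIndexed('a b c')); // 2
  Writeln(CountSpacesChars('a b c'));   // 2
end.
```

Mixing the two, for example looping from Low(S) to High(S) while reading S.Chars[I], would skip the first character and read past the end whenever zero based strings are disabled.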

Hopefully, you've inferred from what I've covered so far that zero based strings are something new in the compiler. There really isn't much more to say about them other than: they are either ON or OFF and they change how the compiler interprets the value inside the brackets that follow a string variable. In addition to the directive you see in the source example above, you can also control this using the --zero-based-strings[+|-] switch to the command line compiler. Zero based strings are disabled by default in the current desktop compilers, but that default could change in a future release.

I had hoped to cover a few more of the changes in this post but it's already getting a bit long so I'll be back soon with more.

