Embedding Quotes in Forth Strings

2011-05-03

Background

Like most computing languages, Forth supports quote-delimited string literals, e.g.

    : HELLO-WORLD  ." Hello world!" ;

    S" This is a string literal."

    ABORT" file not found"

Unlike many languages, there is no way to insert the delimiter character " itself into the above strings. This seemingly strange omission may be explained as follows:

To illustrate the last point, let's suppose we wish to display the following text.

    This string contains "quote" marks.

The standard functions ." and S" cannot be used, as the resulting string would consist only of the characters up to the first double-quote. One solution is to PARSE the string using as delimiter a character that does not occur in the text, e.g.

    CR CHAR | PARSE This string contains "quote" marks.| TYPE CR

This produces the desired result. If many such strings need to be entered, the process can be simplified by defining a dedicated word for use inside definitions.

    : S|  [CHAR] | PARSE  POSTPONE SLITERAL ; IMMEDIATE
    : TEST  CR S| This string contains "quote" marks.| TYPE CR ;
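
As written, S| is only meaningful while compiling, since it always executes SLITERAL. A minimal sketch of a variant that also works at the interpreter prompt follows; it tests STATE in the same way as the S" shown under Implementation. It is an illustration rather than part of the original code, and when interpreting, the returned string lives in the input buffer and should be used before the next line is read.

```forth
\ State-aware S| (sketch): compiles a literal inside a definition,
\ otherwise just returns the parsed string.
: S|  ( "ccc<|>" -- | addr u )
  [CHAR] | PARSE
  STATE @ IF  POSTPONE SLITERAL  THEN ; IMMEDIATE

\ Interpreting:
CR S| This string contains "quote" marks.| TYPE

\ Compiling:
: TEST2  CR S| This string contains "quote" marks.| TYPE ;
TEST2
```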

A more complex variation is S\" which uses the backslash escape sequences popularized by the C language, e.g.

    : TEST  S\" \nThis string contains \"quote\" marks.\n" TYPE ;

The Real Problem

The previous solutions may be termed "work-arounds". They offer a way around a defective or deficient function without changing the function itself. As temporary solutions, work-arounds pose little problem. When they become part of Forth, however, they create duplication and risk displacing the functions they were meant to augment. S\" is particularly problematic as it introduces a foreign and competing syntax into the Forth language.

To avoid these issues, an alternative approach to embedded quotes is presented - one that addresses "the real problem". It takes the standard functions ." S" ABORT" etc. and extends them in a manner compatible with standard behaviour.

Control Characters

Embedding of control characters in string literals is not supported. Control characters are inherently non-portable, and Forth practice is to output them separately, e.g. with CR or WRITE-LINE . Applications requiring strings with embedded control characters or C-style escapes (e.g. Windows) are properly supported through library functions rather than the Forth language.
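
As a small illustration of that practice, the sketch below emits line breaks with CR and a tab with EMIT, keeping the string literals free of control characters. The names tab-char and .ROW are hypothetical, and the value 9 assumes an ASCII character set.

```forth
9 CONSTANT tab-char              \ assumption: ASCII horizontal tab

: .ROW ( -- )
  CR ." Name"  tab-char EMIT  ." Value"   \ control chars sent separately
  CR ." ----" ;

.ROW
```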

Embedding Quotes

The embedded quote scheme employed is the same as that used in Fortran, Pascal and most assembly languages. To embed a double-quote character in a quote-delimited string, simply enter the character twice, e.g.

    S" This string contains ""quote"" marks."
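
The rule itself can be modelled apart from the parser. The following sketch collapses each doubled quote in an in-memory string to a single quote. It is not the implementation given in this article, which applies the rule while parsing the source; the names out-max, out-buf, out-len, out+ and UNDOUBLE are hypothetical, and the 80-character scratch buffer is an arbitrary choice.

```forth
80 CONSTANT out-max                  \ hypothetical scratch size
CREATE out-buf  out-max CHARS ALLOT
VARIABLE out-len

: out+ ( char -- )                   \ append char; drop it on overflow
  out-len @ out-max < IF
    out-buf out-len @ CHARS + C!  1 out-len +!
  ELSE DROP THEN ;

: UNDOUBLE ( addr u -- addr' u' )
  0 out-len !
  BEGIN DUP WHILE                    \ while characters remain
    OVER C@ DUP out+                 \ copy the current character
    [CHAR] " = IF                    \ it was a quote:
      DUP 1 > IF                     \ if another character follows
        OVER CHAR+ C@ [CHAR] " = IF  \ and it is also a quote
          1 /STRING                  \ skip the second of the pair
        THEN
      THEN
    THEN
    1 /STRING                        \ advance past the current character
  REPEAT 2DROP  out-buf out-len @ ;

CR CHAR | PARSE This string contains ""quote"" marks.| UNDOUBLE TYPE
```

The last line uses PARSE with | as delimiter only to get the raw doubled text into memory; it should display the text with single quote marks around the word quote.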

Implementation

A sample implementation is provided. A full implementation would include functions such as ABORT" and C" which require system-level support to implement.

The following code is public domain with acknowledgement to Wil Baden on whose code it was based.

255 CONSTANT bufmax  \ may be greater than 255 characters

CREATE buf  bufmax CHARS ALLOT

: +buf ( addr1 len1 len2 -- len3 )
  >R bufmax R@ - MIN ( clip) R>
  2DUP + >R CHARS buf + SWAP CMOVE R> ;

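: PARSE" ( "ccc<">" -- a n )
  0 BEGIN                                \ running length of text in buf
    >R  [CHAR] " PARSE  2DUP R> +buf >R  \ append text up to the next quote
    1+ CHARS +  DUP SOURCE CHARS + U<    \ address after that quote; in source?
  WHILE
    DUP C@ [CHAR] " =                    \ is it a second (doubled) quote?
  WHILE
    1  DUP >IN +!  R> +buf               \ step over it and append one quote
  REPEAT THEN  DROP buf R> ;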

: S" ( "ccc<">" -- | a n )
  PARSE" STATE @ IF POSTPONE SLITERAL THEN ; IMMEDIATE

: ." ( -- )  POSTPONE S" POSTPONE TYPE ; IMMEDIATE

\ counted string support

: STRING, ( addr u -- )  \ PLACE is common but not a standard word
  255 MIN HERE OVER 1+ CHARS ALLOT PLACE ;

: ," ( "ccc<">" -- )  PARSE" STRING, ;

\ : ."  POSTPONE (.") ," ; IMMEDIATE
\ : C"  POSTPONE (C") ," ; IMMEDIATE
\ : ABORT"  POSTPONE (ABORT") ," ; IMMEDIATE

CR .( Testing ... ) CR
CR S" This string includes ""quote"" marks" TYPE
: test1  CR S" This string includes ""quote"" marks" TYPE ; test1
: test2  CR ." This string includes ""quote"" marks" ; test2
HERE ," This string includes ""quote"" marks"  COUNT CR TYPE

Page updated: 03 May 2011