Krdytkyu

Question: \immediate\write with plain text

I have read the related question on TeX.SE, but I don't want the user to have to add ^^J manually. That is, I want \write to output the content exactly as it was typed.

\documentclass{article}
\begin{document}
\newwrite\file
\immediate\openout\file=tmp.txt
\immediate\write\file{
To be or not to be,
that is % the question
}
\closeout\file
\end{document}

It should output

To be or not to be,
that is % the question

Here's my source code. At first I used Python to extract the content from the .tex file. I am now refactoring that into something simpler that uses LaTeX itself to write the code out verbatim, which is how I ran into this question.

Sorry about my poor expression :P
Thanks a lot!

Well, I don't know this environment, I will try after dinner. Thanks a lot :) (2 days ago)
You seem to wish to have linebreaks at the beginning and at the end of the argument removed instead of having them written to file. What behavior do you wish in case the argument of the \write command is empty or contains only a single line-break, i.e., \immediate\write\file{} or \immediate\write\file{<line-break>}? (yesterday)

score 7 · Accepted Answer · 2018-11-21 16:57:37Z

The LaTeX kernel provides the filecontents environment to write to external files without having to worry about catcodes and such. The filecontents package makes minimal changes to this environment, allowing it to be used anywhere in the document (LaTeX's version can only be used in the preamble, for some reason) and allowing it to overwrite existing files, which is also disabled in LaTeX's version.

To produce

To be or not to be,
that is % the question

you use:

\documentclass{article}
\usepackage{filecontents}
\begin{document}
\begin{filecontents*}{tmp.txt}
To be or not to be,
that is % the question
\end{filecontents*}
\end{document}

The starred version (filecontents*) omits the heading that is printed in the standard version of the environment:

%% LaTeX2e file `tmp.txt'
%% generated by the `filecontents' environment
%% from source `test' on 2018/11/20.
%%
To be or not to be,
that is % the question
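A side note that is not part of the original answer: to the best of my knowledge, LaTeX releases from 2019-10-01 onward build the package's features into the kernel environment, so the package should not be needed there. A minimal sketch, assuming such a release; the optional argument overwrite is what permits replacing an existing tmp.txt:

```latex
\documentclass{article}
\begin{document}
% Assumes LaTeX 2019-10-01 or later: no \usepackage{filecontents} needed.
% The optional argument "overwrite" permits replacing an existing tmp.txt.
\begin{filecontents*}[overwrite]{tmp.txt}
To be or not to be,
that is % the question
\end{filecontents*}
File written.
\end{document}
```

On older releases, the version with the filecontents package is the one to use.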

An addendum on my (admittedly lazy) answer:

If you insist on reinventing the wheel (which is much more fun, I must admit), you can create a command that takes care of the \catcode-ing for you. Here is an ad hoc implementation of a \verbwrite command that does the job.

The command syntax is somewhat like that of LaTeX's \verb: you can use either \verbwrite\file{<stuff>} or \verbwrite\file|<stuff>|. In the latter syntax, any character other than { can delimit the contents. This character, obviously, can't appear in <stuff>. The advantage of the second syntax is that { and } need not be balanced inside the contents of the command.

\documentclass{article}

\makeatletter
\long\def\@ifnextchar@other@space#1#2#3{%
  \let\reserved@d=#1%
  \def\reserved@a{#2}%
  \def\reserved@b{#3}%
  \futurelet\@let@token\@ifnch@other}
\let\kernel@ifnextchar\@ifnextchar
\def\@ifnch@other{%
  \ifx\@let@token\other@sptoken
    \let\reserved@c\@xifnch@other
  \else
    \ifx\@let@token\reserved@d
      \let\reserved@c\reserved@a
    \else
      \let\reserved@c\reserved@b
    \fi
  \fi
  \reserved@c}
{\catcode`\ =12
{\global\let\other@sptoken= }%
\gdef\@xifnch@other {\futurelet\@let@token\@ifnch@other}}%
\def\verbwrite{\@ifstar
  {\let\@ifnextchar\@ifnextchar@other@space\verbwrite@grab}%
  \verbwrite@grab}
\def\verbwrite@grab#1{%
  \begingroup
    \catcode`\^^M=13
    \newlinechar`\^^M
    \let\do\@makeother \dospecials
    \catcode`\{=1
    \@ifnextchar\bgroup
      {%
        \catcode`\}=2
        \verbwrite@brace{#1}%
      }%
      {%
        \catcode`\{=12
        \verbwrite@other{#1}%
      }%
}
\def\verbwrite@brace#1#2{%
    \immediate\write#1{\unexpanded{#2}}%
  \endgroup
}
\def\verbwrite@other#1#2{%
  \def\verbwrite@delim##1##2#2{%
    \immediate\write##1{\unexpanded{##2}}%
    \endgroup
  }%
  \verbwrite@delim#1%
}
\makeatother

\begin{document}
\newwrite\file
\immediate\openout\file=tmp.txt
\tracingall
\verbwrite*\file{To be or not to be,
that is % the question}
\verbwrite\file|To be or not to be,
that is } the {question|
\verbwrite\file$être ou ne pas être,
вот в чем вопрос$
\verbwrite\file}être ou ne pas être,
вот в чем вопрос}
\closeout\file
\end{document}

Please beware that I took 63 minutes to write this command, so it is certainly not what you can call robust. Proceed with care :)

Fix 1: Prevent expansion of the text using ε-TeX's \unexpanded (thanks to jfbu :)

Fix 2: Prevent premature tokenization of the delimiter (thanks again to jfbu :)

Feature 1: Added a starred version that ignores spaces before the delimiter of the verbatim content.

Fix 3: Actually allow } as an "other" delimiter (\verbwrite\file}stuff}) (thanks to Ulrich Diez :)

Try this with être ou ne pas être. You will need to add \usepackage[T1]{fontenc} (assuming pdflatex here). And this will only fix those benign letters; then add a Unicode letter, Cyrillic for example. On the other hand the filecontents* environment does not have that problem... even for быть или не быть (2 days ago)
@jfbu It was creating content! The code has developed consciousness! Thanks for the warning, I hadn't realised that :-) (2 days ago)
Keep in mind, it could be worse if @egreg was around, and allow me another remark: try it with & or $ as delimiters... (2 days ago)
@jfbu Oops, that one was bad. Hopefully fixed now. Thanks :) And yes, egreg would most certainly spot a missing % ;) (2 days ago)

score 2 · Answer 2 · 2018-11-21 18:40:00Z

I suggest using the filecontents*-environment.

Be aware that there is also a LaTeX 2ε-package filecontents which does remove some of the limitations that come along with the filecontents*-environment from the LaTeX 2ε-kernel.

If you are in the mood for reinventing the wheel, you can write a macro which does the following:

switch to a verbatim catcode régime,

switch the catcode of the \endlinechar (usually ^^M/ASCII-Return) to 12 so that ASCII-Return is treated like digits and punctuation marks,

read and tokenize, under that catcode régime, the argument containing the text that is to be written to file,

trim leading and trailing endline characters from that text,

write the text to file with \newlinechar set to the same value as \endlinechar.

In (La)TeX there are several stages of processing input.

(La)TeX reads TeX input, e.g., a .tex input file, line by line.

In the pre-processing stage, the characters that form the line are converted to (La)TeX's internal character encoding. (With old-school (La)TeX engines, the internal character encoding is ASCII. With engines based on XeTeX or LuaTeX, it is UTF-8, of which ASCII is a subset.) Then all space characters (code point 32 in both ASCII and UTF-8) that occur at the right end of the line are removed. Then a character is appended to the line whose code point in the internal character encoding equals the value of the integer parameter \endlinechar. Usually the value of \endlinechar is 13, and code point 13 denotes the ⟨RETURN⟩ character in both ASCII and UTF-8. In other words: usually a ⟨RETURN⟩ character gets inserted at the right end of the line.
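The role of \endlinechar can be checked with a small test document. This is only a sketch and not part of the original answer: it uses LaTeX's \typeout to print the current value to the terminal and log, then sets the parameter to -1 inside a group, which switches off the insertion for the lines read while that setting is active.

```latex
\documentclass{article}
\begin{document}
% Report the character code that is appended to every input line.
\typeout{endlinechar = \the\endlinechar}% usually 13, i.e. ^^M
\begingroup
  \endlinechar=-1 % lines read from here on get no character appended
  \typeout{endlinechar = \the\endlinechar}%
\endgroup
Done.
\end{document}
```

The group keeps the change local, so the lines after \endgroup are read with the usual setting again.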

When this is done, the tokenizing stage begins: in this stage (La)TeX takes the characters that form the line as instructions for placing tokens into the token stream. This is the stage where things start to be about so-called tokens, e.g., control-sequence tokens (which come in two flavors: control-word tokens and control-symbol tokens) and character tokens. A character token consists of a character code, denoting the code point in (La)TeX's internal character encoding, and a category code. Category codes make it possible for characters to have special meanings for the (La)TeX engine. E.g., the category code of the backslash character usually is 0 (escape). A character whose category code is 0 at tokenizing time causes (La)TeX to gather the name of a control-sequence token and afterwards place that control-sequence token into the token stream. Likewise, the category code of the opening curly brace usually is 1 (begin grouping) and the category code of the closing curly brace usually is 2 (end grouping). Character tokens of category code 1 introduce groups (i.e., macro arguments consisting of several tokens, local scopes for assignments like macro definitions, or the ⟨balanced text⟩ of things like \scantokens), and character tokens of category code 2 mark where the group in question ends. More information about category codes can be found at https://en.wikibooks.org/wiki/TeX/catcode.
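The category codes mentioned above can be inspected directly. The following is a small sketch, not taken from the original answer, that prints a few of them with \typeout. The values in the comments are the usual LaTeX defaults and will differ if a package has changed them.

```latex
\documentclass{article}
\begin{document}
% \the\catcode`\<char> expands to the category code currently
% assigned to <char>.
\typeout{backslash: \the\catcode`\\}% usually 0 (escape)
\typeout{left brace: \the\catcode`\{}% usually 1 (begin grouping)
\typeout{right brace: \the\catcode`\}}% usually 2 (end grouping)
\typeout{percent: \the\catcode`\%}% usually 14 (comment)
\typeout{return: \the\catcode`\^^M}% usually 5 (end of line)
Done.
\end{document}
```

Writing the character as a control symbol (\{, \%, and so on) after the backtick keeps the test document itself well-formed, because the special character is never tokenized on its own.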

After tokenizing, there is a "stream of tokens". Processing the stream of tokens includes things like expansion of expandable tokens (e.g., macro tokens or expandable primitives like \string or \csname...\endcsname) and, later, carrying out assignments, creating boxes, etc.

When reading and tokenizing a .tex input file, (La)TeX will, during the pre-processing stage, remove spaces at every line ending and insert an endline character at every line ending.

Therefore the input-sequence

\immediate\write\file{
To be or not to be,
that is % the question
}

will, at tokenizing time, i.e., after pre-processing, be treated by (La)TeX as

\immediate\write\file{⟨character due to endline-char-insertion⟩
To be or not to be,⟨character due to endline-char-insertion⟩
that is % the question⟨character due to endline-char-insertion⟩
}⟨character due to endline-char-insertion⟩

Usually the endline character is ^^M, i.e., ⟨RETURN⟩.

Thus, at tokenizing time, (La)TeX will usually treat the above input sequence as

\immediate\write\file{⟨^^M/RETURN-character⟩
To be or not to be,⟨^^M/RETURN-character⟩
that is % the question⟨^^M/RETURN-character⟩
}⟨^^M/RETURN-character⟩

(Which tokens (La)TeX inserts into the token stream when encountering a ⟨^^M/RETURN-character⟩ depends on the category code that is assigned to the ⟨^^M/RETURN-character⟩ at the time of tokenizing.

Usually the category code of the ⟨^^M/RETURN-character⟩ is 5 (end of line). In that case the result depends on the state of (La)TeX's reading apparatus: in state S (skipping blanks) no token at all is inserted; in state M (in the middle of a line) a space token is inserted, i.e., a character token of category code 10 (space) and character code 32 (32 is the number of the space character in (La)TeX's internal character encoding); in state N (about to begin a new line) a \par token is inserted.

In case category code 12 (other) is assigned to the ⟨^^M/RETURN-character⟩, (La)TeX will insert a character token of category code 12 (other) and character code 13 (13 is the number of the ⟨RETURN-character⟩ in (La)TeX's internal character encoding) into the token stream. Such a token can be processed like any other character token.)
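The difference between the two category codes can be observed by writing the same two-line argument twice. This is a minimal sketch rather than code from the original answer. The stream name \demo and the file name catcode-demo.txt are made up for the illustration.

```latex
\documentclass{article}
\newwrite\demo % hypothetical stream name, used only in this sketch
\begin{document}
\immediate\openout\demo=catcode-demo.txt
% ^^M has category code 5 here: the line end inside the braces is
% tokenized in state M and becomes a space token.
\immediate\write\demo{first
second}
% ^^M gets category code 12: the line end inside the braces becomes an
% ordinary character token with character code 13.
\begingroup
\catcode`\^^M=12 %
\immediate\write\demo{first
second}%
\endgroup
\immediate\closeout\demo
Done.
\end{document}
```

The first \write puts "first second" on one line. How the character with code 13 from the second \write shows up in the file depends on the engine and its settings, so the second line is best inspected with an editor that displays control characters.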

Besides this, at writing time (La)TeX will in any case append to the argument of a \write command the sequence of characters/bytes that ends a line in plain text files on the platform in use.

Thus, assuming that we managed to have LaTeX accept the percent character as an ordinary character, the \write command will get something like:

⟨token due to ^^M/RETURN-character⟩To be or not to be,⟨token due to ^^M/RETURN-character⟩that is % the question⟨token due to ^^M/RETURN-character⟩

At writing time, a

⟨platform-dependent sequence for ending the line⟩

will be appended.

If the category code of the endline character, i.e., of the ⟨^^M/RETURN-character⟩, was 5 (end of line) at the time of tokenizing the input, the sequence

⟨space⟩To be or not to be,⟨space⟩that is % the question⟨space⟩⟨platform-dependent sequence for ending the line⟩

will be written to the external file.

If the category code of the endline character, i.e., of the ⟨^^M/RETURN-character⟩, was 12 (other) at the time of tokenizing the input, the sequence

^^MTo be or not to be,^^Mthat is % the question^^M⟨platform-dependent sequence for ending the line⟩

will be written to the external file.

You can ensure that at writing time a ⟨^^M/RETURN-character⟩ also yields the ⟨platform-dependent sequence for ending the line⟩ by assigning the integer parameter \newlinechar the value of the integer parameter \endlinechar.

If you do this as well, the sequence

⟨platform-dependent sequence for ending the line⟩To be or not to be,⟨platform-dependent sequence for ending the line⟩that is % the question⟨platform-dependent sequence for ending the line⟩⟨platform-dependent sequence for ending the line⟩

will be written to the external file.
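A minimal sketch of this combination, again with a made-up stream name \demo and file name newline-demo.txt, and with the percent sign left out of the text so that no further catcode changes are needed:

```latex
\documentclass{article}
\newwrite\demo % hypothetical stream name, used only in this sketch
\begin{document}
\immediate\openout\demo=newline-demo.txt
\begingroup
\catcode`\^^M=12 %
\newlinechar=\endlinechar %
% Every line end inside the braces is now a character token with
% code 13, and \write turns each such character into a line break.
\immediate\write\demo{
To be or not to be,
that is the question
}%
\endgroup
\immediate\closeout\demo
Done.
\end{document}
```

The line ends directly after the opening brace and directly before the closing brace are written as line breaks too, so the two text lines in newline-demo.txt end up surrounded by an empty line at the top and one at the bottom.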

But this way you might get undesired empty lines.

Therefore you may wish to apply a routine for removing leading and trailing ⟨characters due to endline-char-insertion⟩ from the entire argument before letting \write do the writing job.

A coding-example could look like this:

\documentclass{article}

\makeatletter

\begingroup
\catcode`\^^M=12\relax%
\@firstofone{%
  \endgroup%
  \newcommand*\gobbleendl{}\def\gobbleendl ^^M{}%
  \newcommand\trimendls[2]{\innertrimleadendl{#2}#1^^M\relax{#1}}%
  \newcommand*\innertrimleadendl{}%
  \def\innertrimleadendl#1#2^^M#3\relax#4{%
    \ifx\relax#2\relax\expandafter\@firstoftwo\else\expandafter\@secondoftwo\fi%
    {%
      \ifx\relax#4\relax\expandafter\@firstoftwo\else\expandafter\@secondoftwo\fi%
      {\trimtrailendl{}{#1}}%
      {\expandafter\trimtrailendl\expandafter{\gobbleendl#4}{#1}}%
    }%
    {\trimtrailendl{#4}{#1}}%
  }%
  \newcommand*\trimtrailendl[2]{%
    \innertrimtrailendl{#2}.#1\relax.^^M\relax.\relax\relax{#1}%
  }%
  \newcommand*\innertrimtrailendl{}%
  \def\innertrimtrailendl#1#2^^M\relax.#3\relax\relax#4{%
    \ifx\relax#3\relax\expandafter\@firstoftwo\else\expandafter\@secondoftwo\fi%
    {\def\@tempa{#4}}%
    {\expandafter\def\expandafter\@tempa\expandafter{\@gobble#2}}%
    \@onelevel@sanitize\@tempa%
    \newlinechar=\endlinechar%
    \immediate\write#1{\@tempa}%
  }%
}%

\newcommand\immediateverbatimwrite[1]{%
  \begingroup
  \let\do=\@makeother
  \dospecials
  \catcode`\ =10 %We don't want to allow space as verb-arg-delimiter.
                 %Thus let's remove spaces when grabbing undelimited arguments.
  %\endlinechar=`\^^M%
  %\catcode`\endlinechar=5 %
  \bracefork{#1}%
}%
\begingroup
\catcode`\(=1 %
\catcode`\{=12 %
\@firstofone(%
  \endgroup
  \newcommand\bracefork[2](%
    \catcode`\ =12\relax
    \catcode\endlinechar=12 %
    \ifx{#2\expandafter\@firstoftwo\else\expandafter\@secondoftwo\fi
    (%
      \catcode`\{=1 %
      \catcode`\}=2 %
      \internalfilewritercaller(#1}(}%
    }(%
      \internalfilewritercaller(#1}(#2}%
    }%
  }%
}%
\newcommand\internalfilewritercaller[2]{%
  \def\@tempa##1#2{\internalfilewriter{#1}{##1}}%
  \ifx\relax#2\relax\expandafter\@firstoftwo\else\expandafter\@secondoftwo\fi
  {\expandafter\expandafter
   \expandafter\@tempa
   \expandafter\expandafter
   \expandafter{%
   \expandafter\@gobble\string}}%
  {\@tempa}%
}
\newcommand\internalfilewriter[2]{%
  \trimendls{#2}{#1}%
  \endgroup
}%
\makeatother

\begin{document}

\newwrite\file
\immediate\openout\file=tmp.txt\relax

A\immediateverbatimwrite{\file}
{
être ou ne pas être.
That is % the question.
}B%
C%
%
D\immediateverbatimwrite{\file}  |
}être ou ne pas être.
That is % the question.
|E%
F

\immediate\closeout\file

\end{document}

With this example you get

a pdf-file with the sequence ABCDEF. (This shows that no spurious spaces or other characters get inserted.)

a text-file named tmp.txt whose content is:

être ou ne pas être.⟨linebreak⟩
That is % the question.⟨linebreak⟩
}être ou ne pas être.⟨linebreak⟩
That is % the question.⟨linebreak⟩

Because of the linebreaks, editors that also show line numbers might display that file as

1 être ou ne pas être.
2 That is % the question.
3 }être ou ne pas être.
4 That is % the question.
5

By the way: with (La)TeX it is not possible to keep spaces at the ends of lines.

The reason is that (La)TeX reads and tokenizes input line by line, and one of the first things it does to every line of input in the pre-processing stage (even before adding the endline character and starting to tokenize the line) is to remove all spaces that occur at the end of the line.

Thus (La)TeX input like

code⟨space⟩⟨space⟩
more code⟨space⟩⟨space⟩⟨space⟩⟨space⟩⟨space⟩
even more code⟨space⟩⟨space⟩

will in any case be pre-processed to

code⟨character due to endline-char-insertion⟩more code⟨character due to endline-char-insertion⟩even more code⟨character due to endline-char-insertion⟩

before any further processing, tokenization, etc. takes place.

