MJ Shea Consulting and Design Wiki

Introduction

A long, long time ago, while working at IBM, I liked to use a program called GML to format my c-language source code for printing. By adding a few simple tags enclosed in c-language comments, my printouts would include a title page, table of contents, and page numbers. You would think that, in this age when you can find anything on the internet, someone would have written a GML processor for Windows or Linux. If they have, I have not seen it.

Instead, I created a fairly simple program in C that reads my source code file with a few added tags and outputs a file that AsciiDoc uses as input.

To describe how the program works, I will naturally use a simple "Hello World" example.

Hello World

//------------------------------------------------------------------------------
// File Name:   helloworld.c
//
// Description: This program will print "Hello, World!".
//------------------------------------------------------------------------------

//------------------------------------------------------------------------------
// Includes
#include <stdio.h>

//------------------------------------------------------------------------------
// main
void main()
{
   printf("Hello, World!");
   return;
}

As input, AsciiDoc requires a header section and special heading markup delineating chapters. It will print anything enclosed between two lines of four dashes ("----") exactly as entered. The following additions to the helloworld.c example above make AsciiDoc format the output the way I want.

AsciiDoc Formatted Hello World

 helloworld.c
 ============
 :toc:
 :toc-placement: manual
 :numbered:

 ----
 //------------------------------------------------------------------------------
 // File Name:   helloworld.c
 //
 // Description: This program will print "Hello, World!".
 //------------------------------------------------------------------------------
 ----
 toc::[]

 == Includes
 ----
 //------------------------------------------------------------------------------
 // Includes
 #include <stdio.h>
 ----
 == main
 ----
 //------------------------------------------------------------------------------
 // main
 void main()
 {
    printf("Hello, World!");
    return;
 }
 ----

With these tags added, AsciiDoc can create HTML output like this (link).

Here is an explanation of the tags that were added.

The lines added to the beginning of the file give the output file a title and describe how to create a table of contents. The four dashes indicate that what follows should be printed without changing the format. The unformatted section is called a "listing block."

 helloworld.c
 ============
 :toc:
 :toc-placement: manual
 :numbered:

 ----

After the introductory information is listed, the next tags end the listing block, place the table of contents, and start the first chapter called "Includes." Then, another listing block is started.

 ----
 toc::[]

 == Includes
 ----

After listing the includes, a new chapter is started called "main."

 ----
 == main
 ----

At the end of the file, the listing block is closed.

 ----

The C2AsciiDoc Program

The C2AsciiDoc c-language program reads a text file with a few added c-style comment tags and creates an AsciiDoc-ready output file. AsciiDoc can then be used to create an HTML file with the original text along with chapters and a table of contents.

The Tags

The tags used by C2AsciiDoc are placed inside c-language comments. This allows C2AsciiDoc to add the AsciiDoc tags while not impacting the behavior of compilers.

The table below shows the tags C2AsciiDoc uses. Anyone familiar with IBM GML will recognize these tags.

Table 1. C2AsciiDoc Tags
C2AsciiDoc Tag	AsciiDoc Tag	Description
/* .TOC */	:toc: and toc::[]	Place Table of Contents
/* .H1 <title> */	== <title>	Create Level 1 Heading
/* .H2 <title> */	=== <title>	Create Level 2 Heading
/* .H3 <title> */	==== <title>	Create Level 3 Heading
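
The heading rows of the table can be sketched as one small function. This is only an illustration of the mapping, not the code C2AsciiDoc actually uses; the name tag_to_heading and its buffer-based interface are assumptions made for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

// Hypothetical helper: convert one tag line such as "/* .H2 Setup */"
// into the matching AsciiDoc heading such as "=== Setup".
// Returns 1 on success, 0 if the line is not a .H1 to .H3 tag
// or the result does not fit in out.
int tag_to_heading(const char *line, char *out, size_t outsize)
{
    assert(line != NULL && out != NULL);

    const char *front = strstr(line, "/* .H");
    if (front == NULL)
        return 0;

    const char *end = strstr(front, " */");
    if (end == NULL)
        return 0;

    // The level digit sits right after "/* .H"
    int level = front[5] - '0';
    if (level < 1 || level > 3 || front[6] != ' ')
        return 0;

    // The title runs from just after the digit and space up to " */"
    const char *title = front + 7;
    if (end <= title)
        return 0;

    // Level 1 is "==", level 2 is "===", level 3 is "===="
    static const char *marks[] = { "==", "===", "====" };
    int n = snprintf(out, outsize, "%s %.*s",
                     marks[level - 1], (int)(end - title), title);
    return n >= 0 && (size_t)n < outsize;
}
```

For the line "/* .H1 Includes */" this fills the buffer with "== Includes". A line with no tag leaves the buffer alone and returns 0.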

Here is the helloworld.c example above with these tags added. Since the added tags are all enclosed in comments, they will not affect how the program compiles.

C2AsciiDoc Tags Added to Hello World

//------------------------------------------------------------------------------
// File Name:   helloworld.c
//
// Description: This program will print "Hello, World!".
//------------------------------------------------------------------------------

/* .TOC */
/* .H1 Includes */
//------------------------------------------------------------------------------
// Includes
#include <stdio.h>

/* .H1 main */
//------------------------------------------------------------------------------
// main
void main()
{
   printf("Hello, World!");
   return;
}

C2AsciiDoc Program

Here is how the C2AsciiDoc program works.

First, the C2AsciiDoc program opens the text file for reading and creates an output file for writing. It names the output file the same as the input file and appends a new file type ".adoc" to the end. For helloworld.c it creates helloworld.c.adoc.

    //------------------------------------------------------------
    // Input file name.
    infilename = argv[1];
    if (0 != access(infilename, F_OK))
    {
        fprintf(stderr, "Error: File %s does not exist.\n", infilename);
        return 1;
    }

    //------------------------------------------------------------
    // Open the input file.
    infile = fopen (infilename, "r" );
    if (NULL == infile)
    {
        fprintf(stderr,
                "Error: Unable to open file %s with read access.\n",
                infilename);
        return 1;
    }

    //------------------------------------------------------------
    // Open the output file.
    outfilename = malloc(strlen(infilename) + strlen(ADOC_EXT) + 1);
    if (NULL == outfilename)
    {
        fprintf(stderr,
                "Error: Unable to allocate memory for %s%s filename.\n",
                infilename,
                ADOC_EXT);
        return 1;
    }

    sprintf(outfilename, "%s%s", infilename, ADOC_EXT);
    outfile = fopen (outfilename, "w" );
    if (NULL == outfile)
    {
        fprintf(stderr,
                "Error: Unable to open file %s with write access.\n",
                outfilename);
        free(outfilename);
        return 1;
    }
    free(outfilename);
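
The file-naming step can also be pulled out into a helper of its own. The sketch below is one way to do that, not the program's actual layout: make_adoc_name is a hypothetical name, and ADOC_EXT is assumed to be ".adoc" based on the helloworld.c.adoc example above.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define ADOC_EXT ".adoc"   // assumed value, based on helloworld.c.adoc

// Hypothetical helper: build "<infilename>.adoc" in freshly allocated
// memory. Returns NULL if the allocation fails. The caller frees it.
char *make_adoc_name(const char *infilename)
{
    assert(infilename != NULL);

    size_t size = strlen(infilename) + strlen(ADOC_EXT) + 1;
    char *name = malloc(size);
    if (name != NULL)
    {
        // snprintf cannot write past the size we just allocated
        snprintf(name, size, "%s%s", infilename, ADOC_EXT);
    }
    return name;
}
```

Keeping the allocation and the formatting together means the error path never has to print a name that was not allocated.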

Next, it reads through the input file and looks for the ".TOC" tag. If found, it knows where to put the table of contents.

    //------------------------------------------------------------------------
    // Read through the input file looking for TOC and then rewind it to
    // the beginning
    toc_found = false;
    while(NULL != fgets(line,sizeof line, infile))
    {
        // Look for the TOC tag
        if (NULL != strstr(line, "/* .TOC */"))
        {
            toc_found = true;
            break;
        }
    }

If no ".TOC" tag is found, it will tell AsciiDoc to put the table of contents to the left of the code listing. In other words, if a ".TOC" tag was found, a :toc: tag will be written to the output file. If no ".TOC" tag is found, a :toc2: tag will be written. The former says to include the table of contents in line with the output. The latter says to put the table of contents to the left of the output.

Now that it knows the name of the file and where you want to put the table of contents, the program writes the header information to the output file, followed by four dashes to open a listing block.

    //------------------------------------------------------------------------
    // Write the header - Print the title underlined with equal signs
    fprintf(outfile, "%s\n", infilename);
    for (ii = 0; ii < strlen(infilename); ii++)
    {
        fprintf(outfile, "=");
    }
    fprintf(outfile, "\n");

    // If the TOC tag was found, place where requested.  Else, place it
    // in the left column
    if (true == toc_found)
    {
        fprintf(outfile, ":toc:\n");
        fprintf(outfile, ":toc-placement: manual\n");
    }
    else
    {
        fprintf(outfile, ":toc2:\n");
    }

    // Number the chapter titles
    fprintf(outfile, ":numbered:\n");

    // Open the listing block for our file text
    fprintf(outfile, "\n----\n");
    example_open = true;

Next, starting at the top of the file, the program reads your source code line-by-line looking for the tags listed above.

    //------------------------------------------------------------------------
    // Move back to the beginning of the input file
    if ( fseek(infile, 0L, SEEK_SET) != 0 )
    {
        fprintf(stderr, "Error: Unable to rewind %s.\n", infilename);
        return 1;
    }

    //-----------------------------------------------------------------------
    // Read the input file line-by-line until the end of the file is reached
    // NULL will be returned once we hit the end
    while(NULL != fgets(line,sizeof line, infile))
    {

        //---------------------------------------------------
        // Look for the header tag
        head_front = strstr(line, "/* .H");
        head_end   = strstr(line, " */" );
        if ((NULL != head_front) && (NULL != head_end))
        {
            // Read the header level which is right
            // after the '/* .H'
            level  = (int)head_front[5] - (int)'0';
            header = true;
        }
        else
        {
            level  = 0;
            header = false;
        }

        //---------------------------------------------------
        // Look for the TOC tag
        if (NULL != strstr(line, "/* .TOC */"))
        {
            toc = true;
        }
        else
        {
            toc = false;
        }

        ...

If it finds one of the tags, the listing block is closed, the AsciiDoc information is written, and a new listing block is started.

To prevent empty listing blocks, as might happen if you put a ".H1" tag right after a ".TOC" tag, the program keeps track of whether the listing block is open or closed using the example_open boolean value. When a tag is encountered, the listing block is closed only if it is open. Likewise, when source text is found after a tag, a new listing block is started only if it is currently closed.

To force a page break when you use AsciiDoc to create a PDF document, a "<<<" tag is added to the output. This addition is ignored when formatting as markup or HTML.

        //---------------------------------------------------
        // Write to the output file
        if (true == header)
        {
            // Output the asciidoc text for a header
            // First close the example block if it is open
            if (true == example_open)
            {
                fprintf(outfile, "----\n\n");
                example_open = false;
            }

            // Insert a page break instruction
            fprintf(outfile, "<<<\n\n");

            // Now write equal signs for the header
            fprintf(outfile, "=");
            for (ii = 0; ii < level; ii++)
            {
                fprintf(outfile, "=");
            }

            // Write the title after removing the */
            head_end[0] = '\0';
            fprintf(outfile, " %s\n\n", &head_front[7]);
        }
        else if (true == toc)
        {
            // Output the asciidoc text for the TOC
            // First close the example block if it is open
            if (true == example_open)
            {
                fprintf(outfile, "----\n\n");
                example_open = false;
            }

            // Insert a page break instruction
            fprintf(outfile, "<<<\n\n");

            // Now write the toc tag
            fprintf(outfile, "toc::[]\n\n");
        }
        else
        {
            // If no tags found, output the source text
            // First open the example block if it is closed
            if (false == example_open)
            {
                fprintf(outfile, "----\n");
                example_open = true;
            }

            // Just print the line as is,
            fprintf(outfile,"%s", line);
        }
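
The excerpt above stops inside the read loop. As noted in the Hello World walkthrough, the listing block is closed at the end of the file. Here is a minimal sketch of that last step, assuming the same example_open flag. The helper name close_listing_block is hypothetical and not taken from the actual source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

// Hypothetical helper: after the last input line has been written,
// close the listing block if it is still open. Returns the new value
// for the example_open flag, which is always false.
bool close_listing_block(FILE *outfile, bool example_open)
{
    assert(outfile != NULL);

    if (example_open)
    {
        fprintf(outfile, "----\n");
    }
    return false;
}
```

It would be called once after the while loop, for example as example_open = close_listing_block(outfile, example_open);, just before the files are closed.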

Here is a link to the AsciiDoc HTML formatted output for this program.

And, here is a link to the source. Feel free to use this as you wish. If you download it, copy it, use it, change it, or whatever, it becomes yours.

Concluding Thoughts

You can use this program as-is for creating printable listings of source files for any language that uses c-style comments. It works with C#, Java, and Assembly (if you use GAS). It can be modified if you want to use it with Python, XML, HTML, etc…

There are times when you have to manually edit the AsciiDoc file before using AsciiDoc. For example, trying to format c2asciidoc.c does not work very well because it gets all confused when it sees the strings in the code that look for the tags. There are also times when you want to use other AsciiDoc features and you don’t want to take the time to modify c2asciidoc.c.
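
One possible way to reduce that confusion, which the program does not currently do, is to accept a tag only when it is the first non-blank text on a line. The sketch below assumes that convention. The function name line_starts_with_tag is made up for the example.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

// Hypothetical stricter check: the tag counts only when it is the
// first non-blank text on the line, so a quoted "/* .TOC */" inside
// a strstr() call is treated as ordinary source text.
bool line_starts_with_tag(const char *line, const char *tag)
{
    assert(line != NULL && tag != NULL);

    // Skip leading spaces and tabs
    while (isspace((unsigned char)*line))
    {
        line++;
    }
    return strncmp(line, tag, strlen(tag)) == 0;
}
```

With this check, a line consisting of /* .TOC */ still matches, while a line of code that merely contains the tag inside a string literal does not.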

One update I plan to make is to change tab characters into spaces. I usually set my editors to use four spaces to represent a tab. However, AsciiDoc assumes eight characters. This will make your output look kind of strange. You can specify the tab size on the AsciiDoc command line. However, I always seem to forget in my hurry to print things out.
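
A rough sketch of what that update might look like is below. It is not part of the current program: expand_tabs and its parameters are hypothetical, and it assumes it is handed one line at a time, as fgets would supply.

```c
#include <assert.h>
#include <stddef.h>

// Hypothetical helper: copy one line from src to dst, replacing each
// tab with spaces up to the next tab stop. The output is truncated if
// it does not fit. Returns the number of characters written, not
// counting the terminating '\0'.
size_t expand_tabs(const char *src, char *dst, size_t dstsize, size_t tabsize)
{
    assert(src != NULL && dst != NULL && dstsize > 0 && tabsize > 0);

    size_t col = 0;   // output index, which is also the current column
    for (; *src != '\0' && col + 1 < dstsize; src++)
    {
        if (*src == '\t')
        {
            // Pad with spaces to the next multiple of tabsize
            size_t stop = col + (tabsize - col % tabsize);
            while (col < stop && col + 1 < dstsize)
            {
                dst[col++] = ' ';
            }
        }
        else
        {
            dst[col++] = *src;
        }
    }
    dst[col] = '\0';
    return col;
}
```

Calling it with a tab size of four on each line before writing it to the output file would make the listing match an editor set to four-space tabs.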

I hope you find this utility useful. I find that people would rather read an HTML version of my programs than try to read the source directly. I also find it easier to review my code on the couch with a PDF printout and a pen than using a laptop.

Enjoy!
