NAME

HTML::Mason FAQ - frequently asked questions about HTML::Mason


DESCRIPTION

HTML::Mason is a Perl-based web site development and delivery engine. This document is designed to answer questions that arise when using HTML::Mason to develop new applications and convert existing ones.

HTML::Mason supports embedding Perl in HTML pages. In addition to parsing and executing embedded Perl statements, HTML::Mason provides facilities for many common web development issues: templating, caching, debugging, profiling, page previewing, and more. Although it can be used with CGI or even stand-alone, HTML::Mason works best in conjunction with Apache and mod_perl.

This document is maintained by Kwindla Hultman Kramer (kwindla@allafrica.com), and is available from http://allafrica.com/tools/mason/faq.html

The FAQ is also available as a pod file: http://allafrica.com/tools/mason/faq.pod


WHERE TO FIND INFORMATION

Does HTML::Mason have an official web site?

Yes, at http://www.masonhq.com.

Where do I obtain HTML::Mason?

HTML::Mason is available from CPAN (the Comprehensive Perl Archive Network). Details about CPAN are available at http://www.perl.com/. See the INSTALLATION section of this document for tips on obtaining and installing Mason.

Where can I ask questions about HTML::Mason?

You are encouraged to use the HTML::Mason mailing list; that way many people can answer questions and/or benefit from the answer. Information about the list, including how to subscribe, can be found at http://lists.sourceforge.net/mailman/listinfo/mason-users. To post to the list, write to mason-users@lists.sourceforge.net.

You may want to browse past messages in the archive first: http://marc.theaimsgroup.com/?l=mason&r=1&w=2

If the question is more generally about mod_perl, you may wish to post additionally (or only) to the mod_perl mailing list at modperl@apache.org.

If you have a question or comment that you don't feel is appropriate for the group, write to swartz@pobox.com.

I think I've found a bug in Mason, what should I do?

First, check the known bugs page at http://masonhq.com/resources/bugs.html to see if it has already been reported. Next, simplify the example as much as possible to make sure the problem really lies in Mason (i.e. try doing the same thing with just mod_perl and see if it still happens). Finally, send a report to the users' list.


INSTALLATION

What else do I need to use HTML::Mason?

If you are planning on using HTML::Mason in a web environment with the Apache webserver, you'll need a working copy of Apache and mod_perl installed. Make sure that your mod_perl installation works correctly before trying to get HTML::Mason working. Also, if you are running RedHat Linux, do not use the mod_perl RPMs that ship with RedHat. They are not reliable (as of RedHat 6.2).

What platforms does HTML::Mason run on?

Because HTML::Mason consists of only Perl code, it should work anywhere Perl runs (including most Unix and Win32 variants). If it doesn't work on your operating system, let us know.

Can I run HTML::Mason outside a web server?

Yes, in fact HTML::Mason can be useful for generating a set of web pages offline or as a general templating tool. See the ``Standalone Mode'' section of the Interpreter manual:

  http://www.masonhq.com/docs/manual/0.8/Interp.html#standalone_mode
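
As a rough sketch of what a standalone script can look like with the 0.8-era classes: the two directory paths, the component name /hello and its argument are placeholders, and the exact constructor parameters should be checked against the Interpreter manual for your version.

  use HTML::Mason;

  my $outbuf;
  my $parser = new HTML::Mason::Parser;
  my $interp = new HTML::Mason::Interp(
      parser     => $parser,
      comp_root  => '/path/to/comp_root',   # placeholder
      data_dir   => '/path/to/data_dir',    # placeholder
      out_method => \$outbuf,               # collect output in a scalar
  );
  # /hello and its "name" argument are hypothetical
  $interp->exec('/hello', name => 'world');
  print $outbuf;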

Can I run HTML::Mason via CGI?

Yes. Masonite Ilmari Karonen uses HTML::Mason under CGI. He has kindly posted the necessary steps to do this on the HTML::Mason mailing list. His message can be viewed at http://forum.swarthmore.edu/epigone/mason/gimpswumfrex/Pine.SOL.3.96.990917144751.17640B-100000@simpukka

Please note that running HTML::Mason under CGI (or other non-persistent environments) will entail a substantial performance hit, since the perl interpreter has to start up and then load HTML::Mason and its supporting modules for every CGI execution. Using mod_perl or similar persistent environments (SpeedyCGI, FastCGI, etc.) avoids this performance bottleneck.

Why do Mason tests fail during install?

The user that your web server runs as (``nobody'' is a common choice, and the default on many systems) needs to be able to read the Mason build directory. If you are installing via CPAN as root, that directory is likely under /root/.cpan, which will have restrictive permissions. Try chmod'ding the build directory, or moving to a different (perhaps world-readable) build directory for the install.

Why am I getting segmentation faults (or silently failing on startup)?

There are a few known mod_perl issues that cause segmentation faults or a silent failure on the part of Apache to start itself up. Though not specific to HTML::Mason, they are worth keeping in mind:


COMPONENTS

What is a component?

A component is a file that contains some combination of text (typically HTML), perl code and HTML::Mason directives.

Some components are accessed directly by web browsers. These are called top-level components. A top-level component might consist purely of static HTML.

Other components are support components, which are called by top-level components or other support components. These components are analogous to perl subroutines -- they allow you to create small packages of code that you can reuse throughout your project.
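
For instance, a top-level component might call two support components like this. The names header and footer and the title argument are made up for illustration:

  <& header, title => 'Welcome' &>
  % my $now = localtime;
  <p>The time is <% $now %>.</p>
  <& footer &>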

How do components communicate with each other?

Components can return values to their callers, just like subroutines.

Some components may have very simple return values. As an example, consider a component called isNetscape which returns a true value when the client's browser is Netscape and undef when it is not. The isNetscape component could then be used easily in an if() or other control statement.

Of course, components can also return strings of text, arrays, hashes or other arbitrarily complex perl data structures.
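
A sketch of how such a component and its caller might look. The User-Agent test here is an assumption for illustration only, not a reliable browser check:

  === isNetscape ===
  <%init>
    my $agent = $r->header_in('User-Agent') || '';
    # Assumed test: starts with "Mozilla" and does not say "compatible"
    return ($agent =~ /^Mozilla/ && $agent !~ /compatible/) ? 1 : undef;
  </%init>
  ==================

  === caller ===
  % if ($m->comp('isNetscape')) {
  <p>Welcome, Netscape user.</p>
  % }
  ==============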

How do I use modules in components?

Technically you can just say ``use module-name'' at the beginning of a component. The disadvantages of this method are that:

A more efficient method is to put the use line in the handler.pl or use the PerlModule directive. If you want components to be able to refer to symbols exported by the module, you need to use the module inside the HTML::Mason::Commands package. See the ``External modules'' section of the Administrator's Guide:

  http://www.masonhq.com/docs/manual/0.8/Admin.html#external_modules

Can I define subroutines in components?

First, consider using a separate component. A component and a subroutine are functionally equivalent, and using a component plays to Mason's strengths. If the subroutine is small, or limited in scope, consider using a <%def> subcomponent.

You can define named subroutines inside the <%once> section of any component. Defining the subroutine in a <%perl> or <%init> section is not reliable because such a definition would end up nested inside another subroutine, and a named Perl subroutine nested this way binds to the enclosing subroutine's lexical variables only once, so later requests can see stale values.

You can define anonymous subroutines anywhere:

  my $foo = sub {...};
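
Returning to named subroutines, a <%once> definition might look like the following sketch. The name format_price is hypothetical; since all components share one package, pick names that will not collide with subroutines defined in other components.

  <%once>
    # Compiled once, when the component is first loaded
    sub format_price {
      my ($cents) = @_;
      return sprintf('$%.2f', $cents / 100);
    }
  </%once>
  Total: <% format_price(1999) %>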

How do I access GET or POST arguments?

GET and POST arguments are automatically parsed and placed into named component arguments just as if you had called the component with <& &> or $m->comp. So you can get at GET/POST data by pre-declaring argument names and/or using the %ARGS hash which is always available.
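
As a small illustration, suppose a component is requested with a query string such as ?name=Ann&count=3 (the argument names are made up). It could declare and use the arguments like this:

  <%args>
    $name
    $count => 1
  </%args>
  <p>Hello, <% $name %>. Count is <% $count %>.</p>
  % foreach my $key (sort keys %ARGS) {
  <% $key %> = <% $ARGS{$key} %><br>
  % }

Here $name has no default and so must be supplied, while $count falls back to 1. The loop shows that every argument, declared or not, is also present in %ARGS.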

What happens if I include query args in a POST?

Currently, that depends on whether you are using the 'CGI' or 'mod_perl' args_method of HTML::Mason::ApacheHandler. (The default is 'CGI'.)

CGI ignores query args in a POST.

Apache::Request (used by the 'mod_perl' args_method) combines the query and POST args.

For more information on the args_method and ApacheHandler, see: http://www.masonhq.com/docs/manual/ApacheHandler.html#parameters_to_the_use_declarat

Should I use CGI.pm to read GET/POST arguments?

No! HTML::Mason automatically parses GET/POST arguments and places them in declared component arguments and %ARGS (see previous question). If you create a CGI object in the usual way for a POST request, it will hang the process trying to read $r->content a second time.

Can I use CGI.pm to output HTML constructs?

Yes. To get a new CGI object, use

  my $query = CGI->new('');

You have to give the empty string argument or CGI will try to read GET/POST arguments.

To print HTML constructs returned by CGI functions, just enclose them in <%%>, e.g.

  <% $query->radio_group(...) %>

How do I modify the outgoing HTTP headers?

Use the usual Apache.pm functions, such as $r->header_out. See the ``Sending HTTP Headers'' section in the Component Developer's Guide:

  http://www.masonhq.com/docs/manual/0.8/Devel.html#sending_http_headers
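
A minimal sketch, with a header name and value chosen purely for illustration:

  <%init>
    # Ask clients and proxies not to cache this page
    $r->header_out('Cache-Control' => 'no-cache');
  </%init>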

How do I do an external redirect?

Denis Shaposhnikov posted the following component to handle redirects:

  === redirect.pl ===
  <%perl>
    $m->clear_buffer;
    # The next two lines are necessary to stop Apache from re-reading
    # POSTed data.
    $r->method('GET');
    $r->headers_in->unset('Content-length');
    $r->content_type('text/html');
    $r->header_out('Location' => $location);
    $m->abort(301);
  </%perl>
  %
  <%args>
    $location
  </%args>
  ===================

See the next question if your redirect isn't producing the right status code.

Why isn't my status code reaching users' browsers?

The handler sub in a handler.pl file should always return the error code that handle_request($r) produces. Otherwise, things like $m->abort() will not work correctly. So a very, very simple handler sub would look like this:

  sub handler {
    my $r = shift;
    # Return Mason's status code to Apache
    return $ah->handle_request($r);
  }

How do I exit from all components including the ones that called me?

Use $m->abort, documented in the Request manual:

  http://www.masonhq.com/docs/manual/0.8/Request.html#item_abort_return_value_

How do I put comments in components?

The difference between 3 and 4 is that %# comments will not appear in the output, while <!-- --> comments will appear. Both have their advantages.

What's a good way to temporarily comment out code in a component?

For HTML, you might be tempted to surround the section with <!-- -->. But be careful! Any code inside the section will still execute. Here's an example of commenting out a call to an ad server:

  <!-- temporarily comment out
  <& FetchAd &>
  -->

The ad will still be fetched and counted, but not displayed!

A better way to block out a section is if (0):

  % if (0) {
    ...
  % }

Code blocked out in this way will neither be executed nor displayed, and multiple if (0) blocks can be nested inside each other (unlike HTML comments).

How can I capture the output of a component (and modify it, etc.) instead of having it automatically output?

Use $m->scomp, documented in the Request manual:

  http://www.masonhq.com/docs/manual/0.8/Request.html#item_scomp
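
For example, a component could capture another component's output, tidy it, and then emit it. The component path /shared/sidebar and its title argument are hypothetical:

  <%init>
    # Capture the output as a string instead of sending it
    my $html = $m->scomp('/shared/sidebar', title => 'News');
    # Collapse runs of whitespace before emitting
    $html =~ s/\s+/ /g;
  </%init>
  <% $html %>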

Can I use globals in components?

All HTML::Mason components run in the same package (HTML::Mason::Commands), so if you set a global variable in one you'll be able to read it in all the others. You can also initialize global variables in the handler() subroutine in handler.pl as long as you explicitly put them in the HTML::Mason::Commands package.

The only problem is that Mason by default parses components with strict mode on, so you'll get a warning about the global (and Mason considers all such warnings fatal). To get around this you need to declare the global in a ``use vars'' statement inside the HTML::Mason::Commands package:

  package HTML::Mason::Commands;
  use vars qw(...);

or use the Parser allow_globals parameter. See the section on globals in the Administrator's Guide:

  http://www.masonhq.com/docs/manual/Admin.html#using_global_variables
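
As a sketch of the allow_globals approach, assuming the parameter takes a list of variable names including their sigils (check the Administrator's Guide for your version), the Parser in handler.pl could be created like this. The variable names $dbh and %session are examples only:

  my $parser = new HTML::Mason::Parser(
      allow_globals => [qw($dbh %session)],   # example names
  );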

Alternatively you can turn off strict entirely by passing:

  use_strict => 0

when you create the Parser object. Then you can use all the globals you want. Doing this is terribly silly, however, and is bound to get you in trouble down the road.

When I change a component I don't always see the results in the browser. How do I invalidate Mason code caches?

Mason employs two kinds of code caching. First, Mason caches loaded components in memory. Second, Mason keeps an object file (a compiled version of the component) for every loaded component under data_root/obj.

Before executing a memory-cached component, Mason compares the stored timestamp with the timestamp of the source file. If the source file has a later timestamp, Mason will load the component from the filesystem.

Similarly, before using an object file, Mason compares the modified timestamp of the source and object files. If the source file has a later timestamp, then it is reparsed and the object file is overwritten.

The system is designed so that you will immediately see the effects of source file changes. There are several ways for this system to break down; most are easy to avoid once you know about them.

When in doubt, touching the source files (with the Unix touch command, or by re-saving in an editor) should force Mason to reload the component. If that does not work, try removing the object files and/or restarting the server to clear the memory cache. However, these remedies should be necessary only to diagnose the caching problem, not for normal Mason operation. On a normal Mason system cache expiration should just work ``as expected''.

Do data cache files expire automatically when a component or its dependencies change?

Eventually, but not right now. Unfortunately, it is difficult for Mason to check this efficiently in the current implementation. But you can use the following idiom to say ``expire when my component source file changes'':

  $m->cache(..., expire_if=>sub { (stat($m->source_file))[9] > $_[0] } )

Why does the order of output get mixed up when I use print or $r->print?

Since your server is most likely in batch mode, all Mason output gets buffered until the end of the request. print and $r->print circumvent the buffer and thus come out before other Mason output.

Solution: don't use print or $r->print. Use $m->out if you must output inside a Perl section. See the section on output mode in the Administrator's Guide:

  http://www.masonhq.com/docs/manual/Admin.html#out_mode

and the section on $m->out in the Request manual:

  http://www.masonhq.com/docs/manual/Request.html#item_out_string
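
For instance, a loop inside a Perl section can send its output through the Mason buffer like this (the list contents are sample data):

  <ul>
  <%perl>
    my @items = ('apples', 'pears');   # sample data
    foreach my $item (@items) {
      # Goes through Mason's buffer, so ordering is preserved
      $m->out("<li>$item</li>\n");
    }
  </%perl>
  </ul>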

How can I handle file uploads under Mason?

The way you handle this depends on which args method you chose for the ApacheHandler class.

Under the default method, 'CGI', which uses the CGI.pm module to handle incoming arguments, you can use the $m->cgi_object method to retrieve a CGI.pm object which can be used to retrieve the uploaded file(s). Please see the CGI.pm documentation for more details.

If you are using the 'mod_perl' method, which uses Apache::Request, then the request object available as $r in your components will be an object in the Apache::Request class (as opposed to the Apache class). This object is capable of returning Apache::Upload objects for parameters which were file uploads. Please see the Apache::Request documentation for more details.
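
A rough sketch of the 'mod_perl' case, assuming a form field named attachment (a made-up name) and the upload methods described in the Apache::Request documentation:

  <%init>
    # Apache::Upload object for the field, or undef if nothing was sent
    my $upload = $r->upload('attachment');
    if ($upload) {
      my $fh   = $upload->fh;         # filehandle on the uploaded data
      my $name = $upload->filename;   # file name supplied by the client
      my $size = $upload->size;       # size in bytes
      # read from $fh here
    }
  </%init>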

If you are using CGI.pm, there are some configuration issues to be aware of. CGI.pm needs a tmp directory, and you probably want to be able to specify what that directory is. Dave Rolsky writes:

  Try doing this in your handler.pl (or somewhere):

  use CGI qw(-private_tempfiles);

  You must do this _before_ you do:

  use HTML::Mason;

  or

  use HTML::Mason::ApacheHandler;

  That may change which directories CGI tries to use.

  I think the other hack is to do:

  $CGI::TempFile::TMPDIRECTORY = '/tmp';

  during startup.

  The root of the problem is probably that the temp directory is being
  chosen when the module loads during server startup while its still root.
  It sees it can write to /usr/tmp and is happy.  Then when actually running
  as nobody it dies.

  I bet Lincoln would welcome a patch (hint, hint, someone other than me).
  One solution would be to check if you're running under mod_perl and you're
  root.  If so, then check Apache->server->uid and see if that id can write
  to the temp directory too.

  Or something like that.

For more information on the args_method and ApacheHandler, see: http://www.masonhq.com/docs/manual/ApacheHandler.html#parameters_to_the_use_declarat

How can I send a file to the browser to be downloaded, from a component?

Jonathan Swartz provides this ``make the user download $file'' boilerplate:

  my $subr = $r->lookup_file($file);
  return 404 unless -f $file and $subr->status == 200;
  $r->content_type($subr->content_type);
  $r->send_http_header;
  return 200 if $r->header_only;
  $subr->run;
  $m->abort;

Is <%args> exactly like %ARGS, and do I need to worry about it?

Mason allows you to predeclare arguments to components by specifying variables to hold those arguments in an <%args></%args> section. Because these are perl variables that you are predeclaring, they must have legal perl identifier names -- they can't, for example, contain periods.

If you want to pass arguments that do not have legal perl names, you must manually pull those arguments out of the %ARGS hash that Mason sets up for you. Why would you want to give your arguments illegal names, you ask? Well, just for starters, the form input element <input type="image" name="clickable"> will pass arguments clickable.x and clickable.y to the action url automatically. It's part of the HTML standard, and you can't do anything about it.
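
A component receiving that image click could read the coordinates straight from %ARGS, for example:

  <%init>
    # These keys cannot be declared in <%args> because of the period
    my $x = $ARGS{'clickable.x'};
    my $y = $ARGS{'clickable.y'};
  </%init>
  % if (defined $x && defined $y) {
  You clicked at (<% $x %>, <% $y %>).
  % }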

So, <%args> and %ARGS can't be exactly the same. Future releases of Mason may include special <%args> to %ARGS mapping features, and the nature of such hypothetical features has been discussed on the mailing lists:

http://forum.swarthmore.edu/epigone/mason/dwehblerdshil

Why does Mason display the wrong line numbers in errors?

Due to limitations in the parser, Mason can only display line numbers relative to object files. Error reporting will be cosmetically improved in the 1.0 series and line numbers will hopefully be fixed in 1.1.

How can I manipulate cookies?

You can use the helpful modules Apache::Cookie and CGI::Cookie. It's also fairly easy to roll your own cookie-manipulation functions, using the methods provided by the $r global.

One thing to avoid: the combination of CGI::Cookie, Apache::Request, and POST requests has caused people problems. It seems that Apache::Cookie and Apache::Request make a better pair.
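
As a sketch using Apache::Cookie (assumed to be loaded already in handler.pl), the following counts visits in a cookie. The cookie name visits and the one-hour lifetime are arbitrary choices for illustration:

  <%init>
    # Read the cookies sent by the browser
    my %cookies = Apache::Cookie->fetch;
    my $visits  = $cookies{visits} ? $cookies{visits}->value : 0;

    # Send back an updated cookie
    my $cookie = Apache::Cookie->new($r,
        -name    => 'visits',
        -value   => $visits + 1,
        -expires => '+1h',
        -path    => '/',
    );
    $cookie->bake;
  </%init>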


SERVER CONFIGURATION

Can I serve images through an HTML::Mason server?

If you put images in the same directories as components, you need to make sure that the images don't get handled through HTML::Mason. The reason is that HTML::Mason will try to parse the images and may inadvertently find HTML::Mason syntax (e.g. ``<%''). Most images will probably pass through successfully but a few will cause HTML::Mason errors.

The simplest remedy is to have HTML::Mason decline image and other non-HTML requests, thus letting Apache serve them in the normal way. The following line, placed in the handler() subroutine of your handler.pl,

  return -1 if defined($r->content_type) && $r->content_type !~ m|^text/|io;

declines all requests with a content type not starting with ``text/''. This allows text/html and text/plain to pass through but not much else. It is included in the default handler.pl.

Another solution is to put all images in a separate directory; it is then easier to tell Apache to serve them in the normal way. Alternatively, you can configure Apache to handle different filename extensions differently. The next question covers both approaches.

For performance reasons you should consider serving images from a completely separate (non-HTML::Mason) server. This will save a lot of memory as most requests will go to a thin image server instead of a large mod_perl server. See Vivek Khera's performance FAQ for a more detailed explanation.

How can I prevent a particular subdirectory from being handled by HTML::Mason?

Suppose you have a directory under your document root, ``/plain'', and you would like to serve these files normally instead of using the HTML::Mason handler. One solution is to use a Location directive like:

  <Location /plain>
    SetHandler default-handler
  </Location>

More generally, you can use various Apache configuration methods to control which handlers are called for a given request. Ken Williams uses a FilesMatch directive to invoke Mason only on requests for ``.html'' files:

   <FilesMatch  "\.html$">
     SetHandler perl-script
     PerlHandler HTML::Mason
   </FilesMatch>

Or you could reverse this logic, and write FilesMatch directives just for gifs and jpegs, or whatever.

Another solution to these kinds of problems is to put the abort decision in the handler.pl handler sub. For example, a line like the following will produce the same end result as the <Location /plain> directive, above.

  return -1 if $r->uri() =~ m|^/plain|;
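
Putting these pieces together, a handler sub might look like the sketch below. It assumes $ah is the HTML::Mason::ApacheHandler object created earlier in handler.pl:

  sub handler {
    my $r = shift;
    # Decline (-1) so that Apache serves these requests itself
    return -1 if $r->uri() =~ m|^/plain|;
    return -1 if defined($r->content_type) && $r->content_type !~ m|^text/|io;
    # Otherwise let Mason handle it and pass its status code back
    return $ah->handle_request($r);
  }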

Why am I getting 404 errors for pages that clearly exist?

The filename that Apache has resolved to may not fall underneath the component root you specified when you created the interpreter in handler.pl. HTML::Mason requires the file to fall under the component root so that it can call it as a top-level component. (For various reasons, such as object file creation, HTML::Mason cannot treat files outside the component root as a component.)

If you believe the file is in fact inside the component root and HTML::Mason is in error, it may be because you're referring to the Apache document root or the HTML::Mason component root through a symbolic link. The symbolic link may confuse HTML::Mason into thinking that two directories are different when they are in fact the same. This is a known ``bug'', but there is no obvious fix at this time. For now, you must refrain from using symbolic links in either of these configuration items.

With Mason 0.895 and above, if you set Apache's LogLevel to warn, you will get appropriate warnings for these Mason-related 404s.

Some of my pages are being served with content type xxx instead of text/html. How do I get HTML::Mason to properly set the content type?

HTML::Mason doesn't actually touch the content type -- it relies on Apache to set it correctly. You can affect how Apache sets your content type in the configuration files (e.g. srm.conf). The most common change you'll want to make is to add the line

  DefaultType text/html

This indicates that files with no extension and files with an unknown extension should be treated as text/html. By default, Apache would treat them as text/plain.

I have a line in my handler.pl to stop HTML::Mason from processing anything but text files, but I want to generate a dynamic image using HTML::Mason. How can I get HTML::Mason to set the correct MIME type?

Use mod_perl's $r->content_type function to set the appropriate MIME type. This will allow you to output, for example, a GIF file, even if your component is called dynamicImage.html.
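
A sketch of such a component follows. Here make_gif() is a hypothetical function standing in for whatever code produces the raw image data. Keep stray text and whitespace out of the component, since it would be sent along with the image.

  === dynamicImage.html ===
  <%init>
    my $gif = make_gif();             # hypothetical: returns raw GIF bytes
    $r->content_type('image/gif');    # override the text/html default
    $m->out($gif);
  </%init>
  =========================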

Can Mason support multiple component roots, with searching, a la Perl's @INC?

As of Mason 0.8, yes! Consult the ``Component roots (multiple)'' section in the Administrator's Guide:

  http://www.masonhq.com/docs/manual/0.8/Admin.html#component_roots_multiple_

How do I connect to a database from Mason?

The short answer is that almost any perl code that connects to a database outside Mason should also work inside a component. I sometimes do draft development and quick debugging with something like:

  <%once>
  use DBI;
  </%once>
  <%init>
  my $dbh = DBI->connect("DBI:blah:blah", "user", "pass");  # placeholders for your own settings
  ...
  </%init>

The long answer is, of course, longer. A good deal of thought should be put into how a web application talks to databases that it depends on, as these interconnections can easily become both performance bottlenecks and points of failure.

Most people use some sort of connection pooling -- opening and then re-using a limited number of database connections. The Apache::DBI module provides connection pooling that is reliable and nearly painless. If Apache::DBI has been use'd, DBI->connect() will transparently reuse an already open connection, if it can.

The ``right'' place to ask Apache::DBI for database handles is often in the handler.pl handler subroutine. Georgiou Kiriakos writes:

  You can connect in the handler.pl - I find it convenient to setup a
  global $dbh in it.  You just need to make sure you connect inside
  the handler subroutine (using Apache::DBI of course).  This way a)
  each httpd gets it's own connection and b) each httpd reconnects if
  the database is recycled.

Regardless of whether you set up global $dbh variables in handler.pl, the static sections of handler.pl should set up Apache::DBI stuff:

  # List of modules that you want to use from components (see Admin
  # manual for details)
  {
     package HTML::Mason::Commands;
     use Apache::DBI;
     # use'ing Apache::DBI here lets us connect from inside components
     # if we need to.
     # --
     # declare global variables, like $dbh, here as well.
  }
  # Configure database connection stuff
  my $datasource = "DBI:blah:blah";
  my $username = "user";
  my $password = "pass";
  my $attr = { RaiseError => 1, AutoCommit => 1 };
  Apache::DBI->connect_on_init($datasource, $username, $password, $attr);
  Apache::DBI->setPingTimeOut($datasource, 0);


PERFORMANCE

Is Mason fast?

It is typically more than fast enough. 50-100 requests per second for a simple component is typical for a reasonably modern Linux system. Some simple benchmarking indicates that a Mason component is typically about two to three times slower than an equivalent, hand-coded mod_perl module.

Beware of ``Hello World!'' and other simple benchmarks. While these benchmarks do a good job of measuring the setup and initialization time for a package, they are typically not good measures of how a package will perform in a complex, real-world application. As with any program, the only way to know if it meets your requirements is to test it yourself.

In general, however, if your application is fast enough in pure mod_perl, it will most likely be fast enough under HTML::Mason as well.

How can I make my Mason application run faster?

The first thing you can do to optimize Mason performance is to optimize your mod_perl installation. Consider implementing some of the tuning tips recommended in mod_perl_tuning, which ships with every copy of mod_perl.

If your application still needs to run faster, consider using Mason's caching methods ($m->cache and $m->cache_self) to avoid regenerating dynamic content unnecessarily.
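
As a sketch of the cache_self idiom, assuming the expire_in parameter as described in the Component Developer's Guide for your version, an expensive component can serve a cached copy of its own output like this. The ten-minute lifetime is an arbitrary example:

  <%init>
    # If a fresh cached copy exists, it is output and we return at once
    return if $m->cache_self(expire_in => '10 min');
    # Expensive work below runs only when no cached copy is available
  </%init>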

Does Mason leak memory?

We are not aware of any inherent memory leaks in Mason itself. If you do find a memory leak that is traceable to Mason, please check the known bugs list to make sure it hasn't already been reported. If it hasn't, simplify your handler.pl and offending component as much as possible, and post your findings to the mason-users mailing list.

Of course it is always possible for your own component code to leak, e.g. by creating and not cleaning up global variables. And mod_perl processes do tend to grow as they run because of ``copy-on-write'' shared-memory management. The mod_perl documentation and performance FAQ make good bedtime reading.

If you are using RedHat's mod_perl RPM, or another DSO mod_perl installation, you will leak memory and should switch to a statically compiled mod_perl.