[[ch-webreference]]
== Website Configuration
(((Website, Configuration)))

This chapter is a detailed breakdown of the configuration options and
files available when configuring website hosting for a domain.

Throughout this chapter, as with the rest of this documentation, the domain
+my-brilliant-site.com+ is used as an example.

All configuration for the domain +my-brilliant-site.com+ will be performed
inside the [directory]'/srv/my-brilliant-site.com/' directory.

The Bytemark Symbiosis project uses the popular Apache HTTPD software for
serving your websites, and this comes complete with PHP5 along with many
of the most popular PHP extensions.

=== Getting started

All the files required for a website for the domain
*my-brilliant-site.com* are kept in
[directory]'/srv/my-brilliant-site.com/public/htdocs/'.

 * If this directory does not exist, a *404 Not Found* error will be
   returned.
 * If this directory exists, but is empty, then a default page is
   shown.
 * The index file can be written in glossaryterm:HTML[] or
   glossaryterm:PHP[], and should be called 'index.html' or
   'index.php' respectively.
 * Once this directory is present, both http://my-brilliant-site.com and
   http://www.my-brilliant-site.com will show the same content, i.e.
   there is no need to name the site with a *www* prefix.
 * If different content is required for
   http://www.my-brilliant-site.com then that should be put in
   [directory]'/srv/www.my-brilliant-site.com/public/htdocs/'.
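
As a minimal sketch, the directory and a placeholder index page could be
created from a shell as shown here.  The `SRV` variable is an assumption
introduced for this example so that the commands can be tried in a scratch
directory; on a real machine it would be set to `/srv`.  The page content is
purely illustrative.

```shell
#!/bin/sh
set -eu

# Assumption: SRV stands in for /srv so the sketch can be tried safely.
# On a real machine, set SRV=/srv and run as a user allowed to write there.
SRV="${SRV:-$(mktemp -d)}"
domain="my-brilliant-site.com"

# Create the document root for the domain.
mkdir -p "$SRV/$domain/public/htdocs"

# Add a placeholder index page (illustrative content only).
cat > "$SRV/$domain/public/htdocs/index.html" <<'EOF'
<!DOCTYPE html>
<html>
  <head><title>my-brilliant-site.com</title></head>
  <body><p>Hello from my-brilliant-site.com</p></body>
</html>
EOF

# Show what was created.
ls -l "$SRV/$domain/public/htdocs"
```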

[[s-web-cgi-scripts]]
=== CGI scripts
(((Website, CGI scripts)))

If you wish to use CGI scripts for your domain, copy them to a
directory named [directory]'cgi-bin/' beneath the
[directory]'public/' directory.  Each script must be marked as
executable, which means setting its permissions to *755*.  In
@FileZilla@, right click the file and select [menuitem]|File
Permissions...| from the menu, then make sure that *Execute* is
set for the owner, group, and public permissions.

For example, for *my-brilliant-site.com* the scripts would live in
[directory]'/srv/my-brilliant-site.com/public/cgi-bin/'.
(((public/,cgi-bin/)))

Any *executable* files in that directory will now be treated as CGI
scripts for your domain.  For example, if you created the file
'/srv/my-brilliant-site.com/public/cgi-bin/test.cgi', it would be
reachable at http://my-brilliant-site.com/cgi-bin/test.cgi
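
The steps above could be carried out from a shell roughly as follows.  This
is a sketch only: the `SRV` variable is an assumption that lets the commands
run in a scratch directory rather than `/srv`, and the body of 'test.cgi' is
a made-up example of the smallest useful CGI script.

```shell
#!/bin/sh
set -eu

# Assumption: SRV stands in for /srv so the sketch can be tried safely.
SRV="${SRV:-$(mktemp -d)}"
domain="my-brilliant-site.com"
cgi_dir="$SRV/$domain/public/cgi-bin"

mkdir -p "$cgi_dir"

# A trivial CGI script: a header line, a blank line, then the body.
cat > "$cgi_dir/test.cgi" <<'EOF'
#!/bin/sh
echo "Content-Type: text/plain"
echo
echo "Hello from test.cgi"
EOF

# Mark the script executable for owner, group, and everyone else.
chmod 755 "$cgi_dir/test.cgi"

# Preview the response by running the script through sh.
sh "$cgi_dir/test.cgi"
```

The permissions can be checked afterwards with `ls -l`, which should list the
script as `-rwxr-xr-x`.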

[[s-web-statistics]]
=== Statistics
(((Website, statistics)))

Each hosted website has visitor statistics generated automatically,
accessible at http://my-brilliant-site.com/stats/. These
statistics are updated once per day, and the raw access logs are
made available in
[directory]'/srv/my-brilliant-site.com/public/logs/'.

(((Website, statistics, disabling)))
These daily statistics can be disabled by creating the file
'config/no-stats'.
(((config/,no-stats)))

For example, for *my-brilliant-site.com*, creating the file
'/srv/my-brilliant-site.com/config/no-stats' will ensure that
statistics are no longer generated for that domain.  If you wish to
remove any existing statistics, remove the directory
[directory]'/srv/my-brilliant-site.com/public/htdocs/stats/'.
(((public/,htdocs/,stats/)))
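
A minimal sketch of both steps, disabling the statistics and removing those
already generated, might look like this.  The `SRV` variable and the
pre-created directories are assumptions that let the commands run in a
scratch directory; on a real machine `SRV` would be `/srv` and the site would
already exist.

```shell
#!/bin/sh
set -eu

# Assumption: SRV stands in for /srv so the sketch can be tried safely.
SRV="${SRV:-$(mktemp -d)}"
domain="my-brilliant-site.com"
site="$SRV/$domain"

# Stand-in for an existing site with generated statistics
# (only needed for the scratch-directory demonstration).
mkdir -p "$site/config" "$site/public/htdocs/stats"

# Disable future statistics generation for this domain.
touch "$site/config/no-stats"

# Optionally remove the statistics that were already generated.
rm -rf "$site/public/htdocs/stats"
```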

(((Website, statistics, customising)))
It is also possible to customise the statistics generated by editing
the file 'config/webalizer.conf'.  This file is
documented at http://www.webalizer.org/[the Webalizer project
website].
(((config/,webalizer.conf)))

If there are many sites on the same machine, then it is possible to
customise all the sites' Webalizer configurations by editing the
template that is available at
'/etc/symbiosis/apache.d/webalizer.conf.erb'.  Configuration files
will be updated when the statistics are next generated, but only for
sites whose configurations either do not exist, or have not been
edited by hand.

[[s-web-testing-new-sites]]
=== Testing new websites
(((Website, testing new sites)))

You can view new websites before any DNS changes are made.

For example, if the virtual machine *example.default.bytemark.uk0.bigv.io* is hosting
*www.my-brilliant-site.com*, i.e. the directory
[directory]'/srv/my-brilliant-site.com/public/htdocs/' has been
created, then the website can immediately be viewed at
http://my-brilliant-site.com.testing.example.default.bytemark.uk0.bigv.io.

There are some important things to note though:

    - There is no *www* part added to the domain name -- it is just
      the directory name prepended to
      *.testing.example.default.bytemark.uk0.bigv.io*.
    - This testing alias isn't guaranteed to work in all cases.  For
      complex site setups it might not work entirely as expected.
    - The testing alias only allows the testing of websites.  FTP
      logins and email delivery or checking are not supported
      through it.
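
The naming rule can be sketched as a small shell function.  The function
name `testing_url` is hypothetical and exists only for this illustration; it
simply joins the directory name and the machine name as described above.

```shell
#!/bin/sh
set -eu

# testing_url is a hypothetical helper used only for this sketch.
testing_url() {
    dir_name="$1"    # name of the directory under /srv, without a www prefix
    machine="$2"     # fully qualified name of the machine hosting the site
    printf 'http://%s.testing.%s\n' "$dir_name" "$machine"
}

url="$(testing_url my-brilliant-site.com example.default.bytemark.uk0.bigv.io)"
echo "$url"
```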

=== Displaying the same content under two domains

In this scenario you have registered two domains, for example
*my-brilliant-site.com* and *my-brilliant-site.co.uk*, and you want the same content
to be served at both addresses.  There is no need to create two
separate directory structures.  Instead, set up one directory
structure and then create a soft link (also known as a symbolic link
or symlink) from the second name to the first.

[procedure]
- Once the my-brilliant-site.com directory structure has been completed, log
  on to your machine as admin over SSH.
- Run the command `ln -s /srv/my-brilliant-site.com /srv/my-brilliant-site.co.uk`

This creates a symbolic link named 'my-brilliant-site.co.uk' pointing at 'my-brilliant-site.com'.
Now browsing to my-brilliant-site.co.uk will show the same content that appears at
my-brilliant-site.com.
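
To check the result, the link and the content behind it can be inspected from
the shell.  The sketch below recreates the arrangement in a scratch
directory; the `SRV` variable and the one-line index page are assumptions
made for the demonstration, and on a real machine `SRV` would be `/srv`.

```shell
#!/bin/sh
set -eu

# Assumption: SRV stands in for /srv so the sketch can be tried safely.
SRV="${SRV:-$(mktemp -d)}"

# Stand-in for the completed my-brilliant-site.com directory structure.
mkdir -p "$SRV/my-brilliant-site.com/public/htdocs"
echo "<p>Same content</p>" > "$SRV/my-brilliant-site.com/public/htdocs/index.html"

# Create the symbolic link for the second domain.
ln -s "$SRV/my-brilliant-site.com" "$SRV/my-brilliant-site.co.uk"

# The listing shows the link and its target.
ls -l "$SRV"

# The same file is reachable through either name.
cat "$SRV/my-brilliant-site.co.uk/public/htdocs/index.html"
```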

[[s-preferred-website-domain]]
=== Redirecting to the preferred website domain
(((Website, redirecting to a preferred hostname)))

If a document tree were created in
[directory]'/srv/my-brilliant-site.com/public/' then that site would be
available under two hostnames:

  * http://my-brilliant-site.com/
  * http://www.my-brilliant-site.com/

Some people prefer to use only a single name, and to automatically
redirect visitors who arrive under the _wrong_ name to the preferred
one.  This can easily be achieved by using Apache's mod_rewrite
facility.

If you prefer all visitors see the www-based site you could create the
file '/srv/my-brilliant-site.com/public/htdocs/.htaccess' with the
following contents:

-----------------------------
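# Turn on the rewrite engine for this directory.
RewriteEngine on
# Match any hostname that does not begin with "www." (case-insensitive).
RewriteCond %{HTTP_HOST} !^www\. [NC]
# Redirect permanently to the same path on the www-prefixed hostname.
RewriteRule ^(.*)$ http://www.%{HTTP_HOST}/$1 [R=301,L]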
-----------------------------

This examines each incoming request.  If the hostname doesn't begin
with "www." then that prefix is prepended to the hostname and a
permanent (301) redirect to the resulting address is issued.
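
The decision the rule makes can be imitated in a few lines of shell, which
may help when reasoning about which requests are redirected.  The function
`redirect_target` is a hypothetical helper written for this sketch; it only
mirrors the behaviour described above and is not how Apache evaluates the
rule internally.

```shell
#!/bin/sh
set -eu

# redirect_target is a hypothetical helper used only for this sketch.
# It prints the redirect URL, or nothing if no redirect would be issued.
redirect_target() {
    host="$1"
    path="$2"    # request path without the leading slash, as seen in .htaccess
    # Compare in lower case, since the [NC] flag makes the match case-insensitive.
    lower="$(printf '%s' "$host" | tr '[:upper:]' '[:lower:]')"
    case "$lower" in
        www.*) ;;    # already begins with "www.": no redirect
        *)     printf 'http://www.%s/%s\n' "$host" "$path" ;;
    esac
}

redirect_target my-brilliant-site.com about.html
redirect_target www.my-brilliant-site.com about.html
```

Only the first call prints a URL.  On a live site the same behaviour can be
observed with `curl -I http://my-brilliant-site.com/`, which should report a
301 status and a `Location` header, assuming the '.htaccess' file is in place.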

[[s-web-custom-configuration]]
=== Custom Apache configuration
(((Website, Custom configuration)))

It is perfectly possible to alter the way Symbiosis configures Apache,
either for an individual domain, or for all domains hosted on the
server.

Symbiosis hosts sites on a server in one of two ways, based on the IP
address configured for the site.  If the site uses one of the server's
primary IP addresses, it is served using the "mass-hosting"
configuration.  If the site has a secondary IP address assigned,
Symbiosis generates an individual snippet for that site, and Apache is
configured to use that snippet when dealing with HTTP requests for
that domain.  Both kinds of configuration are generated from
templates, which the server's administrator can edit to adjust the
resulting configuration.

In [directory]'/etc/symbiosis/apache.d/' there are a number of
templates that are used to generate configuration snippets for both
the mass-hosting configuration and individual sites.

=== Logging
(((Website, access logs)))
(((Website, error logs)))

By default, access requests for each site on a machine will go to
'public/logs/access.log'.  If the site has SSL enabled, the request
logs will go to 'public/logs/ssl_access.log'.  These logs get rotated
once a day, and compressed after two days.

The error logs for a site will go to one of two places, depending on
how the site is configured.  If the site has its own SSL certificate,
or otherwise has its own IP address, then the error logs will go to
'public/logs/error.log', or 'public/logs/ssl_error.log'.  Otherwise
the error logs will go to
'/var/log/apache2/zz-mass-hosting.error.log'.

Finally, if a request is received for a domain that is not present on
the machine, then it is logged to 'zz-mass-hosting.access.log' if it
is received on the primary IP of the machine.  If the request arrives
on any other IP then it is logged to 'other_vhosts_access.log'.  Both
of these files are located in [directory]'/var/log/apache2/'.
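
When hunting for a site's errors it can be handy to work out which of the two
error logs applies.  The sketch below uses a hypothetical helper,
`error_log_for`, and makes one assumption that is not stated above: it treats
the presence of 'public/logs/error.log' as a sign that the site has its own
log, and otherwise falls back to the mass-hosting log.  The `SRV` variable is
likewise an assumption so that the example runs in a scratch directory.

```shell
#!/bin/sh
set -eu

# error_log_for is a hypothetical helper used only for this sketch.
# Assumption: an existing public/logs/error.log means the site logs there.
error_log_for() {
    site_dir="$1"
    if [ -f "$site_dir/public/logs/error.log" ]; then
        echo "$site_dir/public/logs/error.log"
    else
        echo "/var/log/apache2/zz-mass-hosting.error.log"
    fi
}

# Assumption: SRV stands in for /srv so the sketch can be tried safely.
SRV="${SRV:-$(mktemp -d)}"
site="$SRV/my-brilliant-site.com"
mkdir -p "$site/public/logs"

# No per-site error log yet, so the mass-hosting log is reported.
error_log_for "$site"

# Once a per-site error log exists, it is reported instead.
touch "$site/public/logs/error.log"
error_log_for "$site"
```

On a real machine the result could be passed straight to `tail`, for example
`tail -n 20 "$(error_log_for /srv/my-brilliant-site.com)"`.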

//
// This layout section should always be last.
//
[[s-web-configuration-layout]]
=== Web configuration layout
(((Website, configuration layout)))

Here is an example configuration layout for the domain
+my-brilliant-site.com+, all of which is contained under
[directory]'/srv/my-brilliant-site.com/'.

'config/no-stats':: If this file exists, no statistics will be
generated for this domain.  Existing statistics in
[directory]'public/htdocs/stats/' will not be removed automatically.
(((config/,no-stats)))

'config/ssl-only':: If this file exists, traffic will be redirected
to the SSL version of the website.  This will also configure your site
to use
https://en.wikipedia.org/wiki/HTTP_Strict_Transport_Security[Strict
Transport Security].
(((config/,ssl-only)))

'config/webalizer.conf':: This is the Webalizer configuration file
for this domain.
(((config/,webalizer.conf)))

[directory]'public/cgi-bin/':: This is the
directory which may be used to hold <<s-web-cgi-scripts,CGI scripts>>
for your domain.
(((public/,cgi-bin/)))

[directory]'public/htdocs/':: This is the
directory from which content is served for the URLs
http://my-brilliant-site.com/ _and_ http://www.my-brilliant-site.com/.
If this directory does not exist visitors will be shown an error page.
(((public/,htdocs/)))

[directory]'public/htdocs/stats/':: This directory will be
automatically created, if it isn't already present, and updated with
statistics referring to the number of visitors to your website.
(((public/,htdocs/,stats/)))

'public/logs/access.log':: This file contains the Apache webserver
access log for the domain.  It will be archived daily, and removed
after 30 days.
(((public/,logs/,access.log)))

'public/logs/ssl_access.log'::  This file contains the Apache
webserver access log for the domain when accessed over SSL.
(((public/,logs/,ssl_access.log)))

'public/logs/error.log'::  This file contains the Apache webserver
error log for the domain, if the domain has been configured to run
under its own IP address.  It will be archived daily, and removed
after 30 days.   If the site does not have its own IP address, then
errors are logged to '/var/log/apache2/zz-mass-hosting.error.log'.
(((public/,logs/,error.log)))

'public/logs/ssl_error.log'::  This file contains the Apache
webserver error log for the domain when accessed over SSL, if the
domain has been configured with its own IP address.
(((public/,logs/,ssl_error.log)))


