Today I ran into some problems while creating French content for a new website. French uses accented characters that only display correctly if the web page's charset is set to UTF-8 (or another charset that matches the file's encoding), or if they are converted to the appropriate HTML entities (see this excellent list of HTML entities for French sites).
I edit my HTML files using Vim on Windows. I have set up Vim so that it can edit UTF-8 encoded files (including Chinese characters) and save them as such. So I wrote the accents directly in the HTML code, without using HTML entities. When I uploaded the files to the site, Firefox displayed question marks inside black diamonds wherever these characters should have been.
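Those diamonds are the Unicode replacement character (U+FFFD), which a browser shows when the bytes it receives are not valid in the charset it is decoding with. One quick way to narrow this down is to check whether the uploaded file really contains valid UTF-8. The sketch below is only an illustration: `isValidUtf8()` is a hypothetical helper name, not part of the templating system described here, and it relies on `preg_match()` with the `u` modifier failing on malformed UTF-8.

```php
<?php
// Hypothetical helper (not from the original templating code):
// returns true when $bytes is a well-formed UTF-8 string.
function isValidUtf8($bytes) {
    // With the "u" modifier, preg_match() returns false on malformed
    // UTF-8 instead of 1, even for an empty pattern.
    return preg_match('//u', $bytes) === 1;
}

$utf8   = "caf\xC3\xA9"; // "cafe" with an accented e, encoded as UTF-8 (two bytes)
$latin1 = "caf\xE9";     // the same word encoded as ISO-8859-1 (one byte)

var_dump(isValidUtf8($utf8));   // bool(true)
var_dump(isValidUtf8($latin1)); // bool(false)
```

If the second case matches your file, the editor probably saved it in a single-byte encoding even though the page is being served as UTF-8.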
I tried different solutions, but one of them worked particularly well in my case:
The site in question is Joseph SARL, a simple shopfront site. I built it using a templating system I wrote in PHP5, which I have also used for other small sites. All pages are called from index.php, which then includes the requested page (e.g. index.php?page=contact). The idea was to use a couple of PHP functions to convert the special characters to HTML entities automatically. So, instead of including the HTML file, I decided to fopen it, fread it, and pass its contents through the htmlentities() PHP function, as follows:
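<?php
class template {
    // Read the requested page and return it with special characters
    // converted to HTML entities.
    public function printPage() {
        $handle = fopen($this->page, "r");
        $contents = fread($handle, filesize($this->page));
        fclose($handle);
        // No charset argument: htmlentities() assumes ISO-8859-1 before
        // PHP 5.4 and UTF-8 from 5.4 onwards.
        return htmlentities($contents);
    }
}
?>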
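<?php
class template {
    public function printPage() {
        $handle = fopen($this->page, "r");
        $contents = fread($handle, filesize($this->page));
        fclose($handle);
        // htmlentities() also escapes <, >, & and ", which breaks the page's
        // own markup; htmlspecialchars_decode() turns those four back.
        return htmlspecialchars_decode(htmlentities($contents));
    }
}
?>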
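To see why the second version wraps htmlentities() in htmlspecialchars_decode(), it helps to run both on a small string. The following is a minimal sketch, not the site's code: the sample string and the explicit ENT_COMPAT and 'UTF-8' arguments are my assumptions, chosen so that the result does not depend on the PHP version's defaults.

```php
<?php
// Sample markup; "\xC3\xA9" is an accented e encoded as UTF-8
// (the input encoding is an assumption for this example).
$html = "<p>caf\xC3\xA9</p>";

// htmlentities() alone escapes the tags as well as the accent.
$escaped = htmlentities($html, ENT_COMPAT, 'UTF-8');
echo $escaped, "\n"; // &lt;p&gt;caf&eacute;&lt;/p&gt;

// htmlspecialchars_decode() only reverses &lt; &gt; &amp; and &quot;,
// so the tags come back while &eacute; stays an entity.
$fixed = htmlspecialchars_decode($escaped, ENT_COMPAT);
echo $fixed, "\n"; // <p>caf&eacute;</p>
```

The accented character ends up as an entity, while the markup itself is left as written.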
<?php
// eval() runs a string as PHP code. The string starts in PHP mode,
// so this is equivalent to a file containing: echo "test";
eval('echo "test";');
?>
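Since eval() starts in PHP mode, a string that mixes HTML and PHP has to be prefixed with a closing tag before it is evaluated. Here is a small sketch of that trick. `renderMixed()` is a hypothetical name used only for illustration, and the output buffering is my addition so the result can be returned rather than printed directly.

```php
<?php
// Hypothetical helper: evaluate a string that mixes HTML and PHP
// and return what it prints.
function renderMixed($source) {
    ob_start();              // capture everything eval() outputs
    eval('?>' . $source);    // the closing tag switches eval() to HTML mode
    return ob_get_clean();   // return the captured output as a string
}

echo renderMixed('<p><?php echo 1 + 1; ?></p>'); // <p>2</p>
```

Keep in mind that eval() runs whatever PHP the string contains, so this should only be used on files you control.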
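<?php
class template {
    public function printPage() {
        $handle = fopen($this->page, "r");
        $contents = fread($handle, filesize($this->page));
        fclose($handle);
        // The leading closing tag makes eval() start in HTML mode, so the
        // page behaves like an included file and its PHP blocks are executed.
        // eval() prints its output and returns NULL, so buffer the output
        // in order to return it.
        ob_start();
        eval("?>" . htmlspecialchars_decode(htmlentities($contents)) . "<?php ;");
        return ob_get_clean();
    }
}
?>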