I just wanted to encode my own mail address on a static HTML page to prevent it from being collected by spam bots or crawlers. However, the
tool I used for this vanished together with my old website. I searched for a similar tool on the Internet but after about 10 seconds I had an excuse
to make a new one on my own. :3
The HTML obfuscator basically replaces normal characters with their respective HTML escape codes. Markdown also uses this
approach and randomly mixes the decimal and hex representation of the character code. PHP Markdown takes it a bit further and
adds a few scattered unencoded characters to the mix, so I used that algorithm.
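The idea can be sketched in a few lines. This is a minimal Python sketch and not the tool's actual code: the function name obfuscate, the keep_ratio parameter and the exact probabilities are assumptions made for illustration.

```python
import html
import random


def obfuscate(text, rng=None, keep_ratio=0.1):
    """Encode each character as a decimal or hex HTML entity.

    A small share of characters (keep_ratio, an assumed value) stays
    unencoded. "@", ":" and HTML special characters are always encoded.
    """
    rng = rng or random.Random()
    parts = []
    for char in text:
        roll = rng.random()
        if roll < keep_ratio and char not in "@:&<>\"'":
            parts.append(char)                  # leave the character as-is
        elif roll < 0.55:
            parts.append("&#x%x;" % ord(char))  # hex representation
        else:
            parts.append("&#%d;" % ord(char))   # decimal representation
    return "".join(parts)


address = "me@example.com"  # placeholder address
encoded = obfuscate(address)
link = '<a href="%s">%s</a>' % (obfuscate("mailto:" + address), encoded)

# A browser decodes the entities again, just like html.unescape() does.
print(encoded)
print(html.unescape(encoded))
```

Every call gives a different encoding, but decoding always yields the original text again.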
Just put your text (mail address, Jabber URI, Skype name, etc.) into the tool and it will output the encoded version. Since the main use is mail
addresses it will also output a mailto: URI as well as a complete mailto: link with the mail address as text.
I also played around with some special Unicode characters to break the long lines of codes in a more pleasant way. The soft hyphen (U+00AD) is
displayed as a normal hyphen (-) if it's at the end of a line, otherwise it's not displayed at all. The zero width space (U+200B)… well, is like
a normal space but cannot be seen. Basically it's just an offer to the browser to break the line wherever you insert it. However, if I embedded
such special characters into the code generated by the tool you would copy and paste them along with the code, and this would surely confuse
some people. Therefore I abandoned this idea (even though it looked quite good) and used the usual overflow: auto scrollbars.
Anyway, I hope some people find this tool useful. If you have questions or some ideas just post a comment or drop me a mail.
Welcome to the third brand new version of my website. It's not only a face lift but also (again) a complete
rewrite of the entire website. I officially switched the page to English now and German posts are specially
marked since this fits my habits more. There's also some new stuff like the tag cloud, the projects
page and a proper archive. I've also added some funny things like the blurred navigation
subtitles and some texts at the bottom of pages. So feel free to look around.
Ok, that was the obligatory introduction everyone probably skips, so right down to the interesting stuff.
Motivation
The old website was fairly sophisticated and had a lot of features, so why another rewrite? Basically
because it was sophisticated and complex. Maintenance also was a concern since Rails applications tend
to be a bit difficult to deal with in the long run. However the strongest point was simplicity.
Often complexity is added without properly thinking about it and I wanted to know how far simple stuff
can carry you. I'm not talking about the simplicity frameworks like Rails or CakePHP give you: this simplicity
is often bought with knowledge of a framework which tries to hide its own complexity (Rails is very good
here). However, if you look at it on the larger scale (e.g. like a good system administrator does) such a
framework adds a tremendous amount of complexity to your application, which you always need and use
of course.
I wanted to get closer to a form of "absolute" simplicity. If there is nothing, nothing can break and you
have nothing to maintain. However, I need at least something to write my stuff and the bare minimum for that
is a bunch of files: text files with the content and some additional files like pictures grouped into some
directories. This was the simplest way I came up with and much to my surprise it's all you need for a
basic website like this one. Add a few simple PHP lines to the mix to present these text files as nice HTML
and use SSH to create and edit the text files. That's it. It doesn't just sound easy, it is easy.
The weblog files in Ubuntu's file browser
At a time when many ditch SQL in favor of more scalable (and simple) key-value stores, the move to
simple files looks a bit odd. At least it did to me. However, if you think about it for a while you realize that
this opens up many new options. Caching is basically as simple as with key-value stores (e.g. serialize and
store frequently used data) and if disk I/O becomes a bottleneck you can move the most used files
to a RAM disk. You can use version control for your content (e.g. Subversion) or do any crazy stuff you
can do with file systems (UnionFS, EncFS, …). It happened to me more than once on this
project that the simple approach proved to be just as flexible and extensible.
Server side technology
Enough about the why, let's move on to the how. For the text files I used a format that is used so often that
most programmers don't even notice it. No, I'm not talking about XML but about a simple header and body
structure like the one used by mails (SMTP), HTTP and many other protocols.
A simple post file
Title: My post
Tags: test, files

And here comes the content.
These files are then read by some PHP pages. I didn't use a PHP framework… well, if you look at PHP
closely enough it already is a framework. Thanks to the alternative control structure syntax
and short opening tags you get a nice template system for free. There are also some other features like
anonymous functions that keep the code clean as long as you keep it reasonably simple,
not to mention the large standard library.
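Reading such a header and body file takes only a few lines. Here is a minimal sketch in Python rather than the PHP the site actually uses. The function name parse_post is made up for illustration, and the sketch assumes that an empty line separates the headers from the body, as in mails and HTTP.

```python
def parse_post(text):
    """Split a post into a dict of headers and the body text."""
    # Assumption: the first empty line ends the header block.
    head, _, body = text.partition("\n\n")
    headers = {}
    for line in head.splitlines():
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return headers, body.strip()


post = "Title: My post\nTags: test, files\n\nAnd here comes the content."
headers, body = parse_post(post)
tags = [tag.strip() for tag in headers["tags"].split(",")]

print(headers["title"])
print(tags)
print(body)
```

Lower-casing the header names is a design choice of this sketch so that "Title" and "title" end up under the same key.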
On the larger scale I used some mod_rewrite rules to build nice and clean URLs that get rewritten to
a Rails-like structure of PHP files. There's one PHP file for each action and it contains the logic code at
the beginning, followed by the view code (mainly HTML with some embedded PHP). And thanks to
output buffering it's easy to wrap everything into a layout. Those are about all the features that
I used most of the time. You get all this for free without any maintenance. Well, you have to install PHP
but that's just one package you can install along with Apache.
Design
The new design with the same test entries as in the prototype
The most visible change is a new design based on the Modern Ambience design prototype. However,
the whole design was made completely from scratch directly as HTML5 and CSS3 code. I didn't open
Inkscape once, simply because it wasn't necessary. The article Beautiful UI styling with CSS3 text-shadow,
box-shadow, and border-radius more or less was the spark it all started with. While the original
Modern Ambience prototype took several weeks to complete, the current design was done in about 3 hours.
All thanks to the new CSS3 properties. However, these properties also made this website a little browser
benchmark and you can actually see a performance difference between Opera, Firefox and Chromium.
Writing the HTML code with the semantic tags of HTML5 that matched the content at hand was actually fun.
It just feels good if your markup gives you a good representation of the content. Thanks to CSS2 and CSS3 selectors
styling these elements isn't a problem and in the end I only needed one semantically useless div element in the layout.
I strongly recommend that anyone building websites check out the Beautiful UI article. It's
astonishing what's possible with CSS and it's actually faster than using an image program like Photoshop or
Inkscape to design the website.
The endless details
As with every project there are a lot more details I could write about: some interesting use of the table-*
values of the display property for the figures and their captions, the tag cloud, HSL colors, the PHP class
used to load the files, simple caching functions, the comment markup, the no foreign request policy etc. If
you're interested in some details feel free to post a comment. I'm willing to answer any questions. :)
As a last note: This project proved to me that simplicity is worth the time spent thinking about what you
need. So before the next project take a few minutes and think about what you really need before
firing up your framework of choice.
That task doesn't sound too difficult, does it? Well, I thought so too until I encoded an event recording
as an MP4 file. I did this in the usual way using mencoder and ffmpeg. Nothing special, no fancy
stuff. However, the first response was: the video doesn't play on MacOS X (using Quicktime). Next
response: It doesn't work on Windows (using Windows Media Player 12). Hm… both officially support
MP4 with H264 video and AAC audio, but it just didn't work. I tested VLC, Totem and Mplayer on Ubuntu
Linux and VLC and Media Player Classic on Windows 7. Everything worked there without any problems.
Maybe I should have answered: Just install Ubuntu Linux, because everything works there, out of
the box. After all, this is the usual answer I get if some fancy Windows or Mac stuff doesn't work on
Linux right away.
However, here is the full technical story on how to create an MP4 video that plays in Windows Media
Player and Apple Quicktime. If you encode a video you usually get something with an H264 video track
and an AAC audio track. Now this isn't the whole truth since H264 isn't always H264 and AAC isn't
always AAC. Both formats have different profiles where the "low" profile only allows simple features for
clients that can't afford to spend too much time decoding the data (like mobile devices where every long
calculation drains the battery). Then there usually is a "main" profile with a normal feature set and a
"high" profile with the most complex features.
In my case after encoding the video I got an H264 high profile (read: complex) video stream and an
AAC main profile audio stream. The iPhone can't handle the H264 high profile but since I wanted to
create a video for desktop computers it sounded suitable. However after a day of searching around I
finally found some other limitations:
Windows Media Player and Apple Quicktime can't handle the AAC "main" profile. They can only play
the "low" profile, called AAC-LC. If you pack in an audio stream
with the "main" profile there simply is no sound at all in these players. Thanks to Marc Seeger for the
hint about this.
Apple Quicktime only plays the entire file if the audio stream is marked as MPEG4 AAC. I
know, sounds strange, but AAC is also part of the MPEG2 standard, and therefore an MPEG2 AAC also
exists. In fact this is what mencoder and ffmpeg (using the FAAC encoder) usually output.
If you are aware of the profiles and these limitations it's not a big deal to build an MP4 video that works
in Quicktime and Windows Media Player. In this example I encode a small 10 second test clip recorded
during one of my projects.
First encode the video and audio stream, here with mencoder:
For clarity I've split the command over multiple lines, that's what the \ at the end of the lines is for.
The -vf parameter in the first line just filters the test.dv input video (deinterlace and filter out
noise) and is of no importance here.
The second line tells x264 (an H264 encoder) to encode the video with a quality of 25 and
the default settings. This will create an H264 High profile video stream that will be stored in the test.avi
output file. Be sure to tweak the x264 options for your situation; here they would only have added a
bunch of confusing parameters.
The third line is the interesting one. Here FAAC (an AAC encoder) gets the job of encoding the audio
stream with a quality of 25. The object=2 parameter makes sure that we get an AAC-LC
stream which allows Windows Media Player and Quicktime to play the audio. mpeg=4 on
the other hand sets the output to MPEG4 AAC (otherwise MPEG2 AAC would be the result). I found
these options in the mencoder man page (just search for -faacopts).
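To make the three parts of the command easier to follow, here is a small Python sketch that assembles such a mencoder call from the description above. It is a reconstruction under assumptions, not necessarily the exact command used here: the filter chain yadif,hqdn3d (deinterlace and denoise) and the use of crf=25 for the x264 quality are guesses. Only the faacopts values object=2 and mpeg=4 and the file names come from the text. The helper name mencoder_command is made up.

```python
import shlex


def mencoder_command(source, target, video_filters, crf=25, aac_quality=25):
    """Build the mencoder argument list for an H264 plus AAC-LC encode."""
    return [
        # First part: input file and video filters.
        "mencoder", source, "-vf", video_filters,
        # Second part: x264 video encoder (crf as quality setting is an assumption).
        "-ovc", "x264", "-x264encopts", "crf=%d" % crf,
        # Third part: FAAC audio encoder, AAC-LC (object=2) as MPEG4 AAC (mpeg=4).
        "-oac", "faac",
        "-faacopts", "quality=%d:object=2:mpeg=4" % aac_quality,
        "-o", target,
    ]


# The filter names are assumptions, the post only says "deinterlace and filter out noise".
command = mencoder_command("test.dv", "test.avi", "yadif,hqdn3d")
print(" ".join(shlex.quote(arg) for arg in command))
```

The list can be passed to subprocess.run() as it is. Printing it first is just a convenient way to check the options before starting a long encode.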
With this we made sure that our video and audio streams would play properly. Now we only have to
pack these streams into an MP4 container:
The details of this procedure are described in the mplayer documentation (also look there
if you want to pseudo-stream the file). I got some error messages on the way (about the H264 stream)
but I don't think they are relevant for this topic.
If we now take a look at the MP4 file with mp4info we get the following:
$ mp4info test.mp4
mp4info version 1.6
test.mp4:
Track Type Info
1 video H264 High@3, 10.080 secs, 204 kbps, 720x576 @ 25.000000 fps
2 audio MPEG-4 AAC LC, 8.554 secs, 95 kbps, 48000 Hz
Tool: mp4creator 1.6
The important part is that we got an MPEG-4 AAC-LC audio stream. With this the file should play
in Windows Media Player and Apple Quicktime. In case you want to try it out feel free to download the
encoded test clip.
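If you encode more than one file, this check can be automated. The following Python sketch looks at the text printed by mp4info and tests the audio track. It assumes the output layout shown above (track number, type, then the info column). The function name audio_is_mpeg4_aac_lc is made up for illustration.

```python
def audio_is_mpeg4_aac_lc(mp4info_output):
    """Return True if the first audio track is reported as MPEG-4 AAC LC."""
    for line in mp4info_output.splitlines():
        # Expected columns: track number, track type, info text.
        fields = line.split(None, 2)
        if len(fields) == 3 and fields[1] == "audio":
            return fields[2].startswith("MPEG-4 AAC LC")
    return False


sample_output = """test.mp4:
Track Type Info
1 video H264 High@3, 10.080 secs, 204 kbps, 720x576 @ 25.000000 fps
2 audio MPEG-4 AAC LC, 8.554 secs, 95 kbps, 48000 Hz
Tool: mp4creator 1.6"""

print(audio_is_mpeg4_aac_lc(sample_output))
```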
Markdown is a plain text format you can easily write and convert to HTML. It has several nice features and one of
those is inline HTML. In one of my current projects (you will see it soon enough…) I'm combining PHP Markdown
with some of the new HTML5 elements:
Normal text…
<figure>
<figcaption>The appearance of FooBar</figcaption>
<img src="foobar.jpg" />
</figure>
Some more text…
Markdown wraps p elements around everything that looks like a paragraph. This would also apply to inline HTML code
but Markdown already has a built-in list of HTML block elements. If it encounters one of those, it's not wrapped inside a
p element. Because HTML5 is pretty new stuff, Markdown doesn't know the HTML5 block elements yet. Therefore it wraps
the figure element inside a p element, thus making the document invalid (and screwing up the styles).
Thankfully the PHP Markdown source code is very well documented and this made it a piece of cake to add the figure
element to the list of block level elements. In line 1770 of PHP Markdown 1.0.1n the variable $block_tags_re
contains the list of block level elements:
# Tags that are always treated as block tags:
var $block_tags_re = 'p|div|h[1-6]|…|hr|legend';
I abbreviated the list a bit for clarity. Just add |figure to the end and you're done. After that modification
PHP Markdown stopped wrapping p elements around the figure elements.
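Why one extra entry in that list is enough can be shown with a toy version of the paragraph wrapping. This Python sketch is not PHP Markdown's real code: the function wrap_paragraphs and the shortened tag lists are made up for illustration, and real Markdown handles many more cases.

```python
import re

BLOCK_TAGS = "p|div|h[1-6]|hr|legend"    # shortened list, for illustration only
PATCHED_TAGS = BLOCK_TAGS + "|figure"    # the same list with figure added


def wrap_paragraphs(text, block_tags):
    """Wrap chunks separated by blank lines in <p>, unless they open with a block tag."""
    opens_with_block_tag = re.compile(r"<(?:%s)\b" % block_tags)
    chunks = [chunk.strip() for chunk in re.split(r"\n\s*\n", text) if chunk.strip()]
    return "\n\n".join(
        chunk if opens_with_block_tag.match(chunk) else "<p>%s</p>" % chunk
        for chunk in chunks
    )


document = 'Normal text…\n\n<figure>\n<img src="foobar.jpg" />\n</figure>'

print(wrap_paragraphs(document, BLOCK_TAGS))    # figure ends up inside a p element
print(wrap_paragraphs(document, PATCHED_TAGS))  # figure is left alone
```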
Since HTML5 is nowhere near finished it is probably too early to add the HTML5 block level elements to the source of the PHP
Markdown project itself. However, for the advantages of HTML5 I'm willing to live with such a small modification. :)
I was playing around with some CSS3 stuff lately and while doing so I noticed that Firebug seems
to be buggy as soon as you use some CSS3 rules like box-shadow:
A paragraph highlighted in Firebug
These funny blocks that look like parts of a picture are indeed parts of the header image. Looks like the margin and
padding highlight somehow displays the wrong stuff. Just as a reference, the above usually looks like this:
The same paragraph viewed normally
I'm sure this Firebug bug :) gets fixed pretty fast but it made me remember that Opera contains a similar
development tool: Dragonfly. It's not as hot and pretty as Firebug but the basic functionality looks pretty solid.
Viewed in Dragonfly with layout highlighted
After this experience I will fire up Opera a bit more often when doing design stuff (right now I use Opera only for
browsing). For the usual grunt work Firebug is absolutely invaluable (the JS debugging alone …) but it's good to know
that there is an alternative at least for some situations. I like variety.
Some time ago I wrote about the virtual machine I got for my video streaming and archiving
project events.mi. While the VM did really well during the last months, some commands displayed
strange error messages.
man for example:
man: can't set the locale; make sure $LC_* and $LANG are correct
Special characters like German umlauts were also not displayed correctly on the console. This wasn't
really a problem until Subversion refused to operate because of a directory containing a special
character.
The quick answer
The locale data is installed but the needed locale is not compiled into the locale archive. You can do
this for the "de_DE.utf8" locale with the following command:
$ localedef -i de_DE -f UTF-8 de_DE
The long story
After a quick search I tried to reinstall and reconfigure the locales package but neither helped to solve
the issue. Since it's obvious that the error is related to locales I did some digging in the POSIX Locale
spec. The LANG environment variable was set to "de_DE.utf8" so there was no problem with the settings.
However, listing the locales showed something strange:
$ locale -a
locale: Cannot set LC_CTYPE to default locale: No such file or directory
locale: Cannot set LC_MESSAGES to default locale: No such file or directory
locale: Cannot set LC_COLLATE to default locale: No such file or directory
C
POSIX
The POSIX spec states that the C and POSIX locales are just placeholders for the default locale.
However there is not a single "real" locale (e.g. "de_DE.utf8") in the list and therefore the default locales
point nowhere. A quick test of the setlocale() function in a small C program also showed that it
always returns NULL and therefore fails.
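The same kind of test can be done without a C compiler. The following Python sketch uses the locale module from the standard library, which calls setlocale() underneath. It is not the C program mentioned above, and the helper name locale_available is made up. For simplicity it only checks the LC_CTYPE category.

```python
import locale


def locale_available(name):
    """Return True if setlocale() accepts the given locale name."""
    previous = locale.setlocale(locale.LC_CTYPE)    # query the current setting
    try:
        locale.setlocale(locale.LC_CTYPE, name)
        return True
    except locale.Error:
        # Python raises locale.Error where the C function returns NULL.
        return False
    finally:
        locale.setlocale(locale.LC_CTYPE, previous)  # restore the old setting


for name in ("C", "POSIX", "de_DE.utf8"):
    print(name, locale_available(name))
```

On a system with the problem described here, the last line should report the "de_DE.utf8" locale as unavailable while C and POSIX still work.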
Now the question is, why are there no real locales installed? I stumbled across a forum post mentioning
localedef as a solution to the problem and the man page of localedef shows the
directories where the character maps (/usr/share/i18n/charmaps) and "raw" locales (/usr/share/i18n/locales)
are stored in a Linux system. These directories contained many character maps and locales but the "compiled"
locales in the locale archive (/usr/lib/locale/locale-archive) were simply missing. Therefore locale -a
returned only the placeholder locales and setting the "de_DE.utf8" locale failed.
Now localedef is the utility that compiles the raw locales and character maps into the locale archive.
So the only thing we have to do is compile the wanted locale:
$ localedef -i de_DE -f UTF-8 de_DE
Now the list of available locales looks like it is supposed to:
$ locale -a
C
de_DE
de_DE.utf8
POSIX
Now the necessary locales are in the locale archive and therefore all programs can use the setlocale() function
again to set the locale to "de_DE.utf8". Directory and file names are also displayed properly and Subversion can
handle special characters again.
I hope this detailed description of the error helps a few people solve similar problems. :)
The 7th GamesDay is finally wrapped up. All recordings are online
and there is nothing big left to do. Finally some time to stand still and look
back a bit.
In contrast to the 6th GamesDay we had considerably more time for
the organization this time (2 months instead of 2 weeks). We didn't use it as well
as last time, but in return we all still had a bit of a life besides the GamesDay.
I'm really glad that so many people helped with the organization this time,
especially towards the end it wouldn't have been doable otherwise. Many did really
great work.
All in all the GamesDay cost me about 42 hours of productive working time
(not counting meetings and time at the HdM) and thereby overtook the Asteroids project
(about 40 hours). In itself 42 hours isn't that much (just about one week of normal
work), but 42 hours of productive work is quite a lot. Some of you may already
have found out that you can't necessarily work productively for 8 hours in an 8 hour
day…
Of the talks, the technical topics once again thrilled me the most. First and foremost
"Game Engine Architecture" by Andreas Stiegler. Simply
brilliant, period. I will watch his talk a few more times as soon as my next
game project starts. Personally I really like relatively realistic or physics-based gameplay
and in his talk Andreas explained in a funny and vivid way how to approach exactly
that kind of thing.
I also really liked the talk "Einführung in die Computer-Echtzeit-Grafik" (Introduction to real-time computer graphics) by
Benjamin Thaut. A nice talk about how to get things onto the screen in
3D graphics so that they look good and can still
move fast enough. I will never forget that particle system
again…
All in all I'm currently very glad that the GamesDay is over. Even though it's
fun, it's a huge effort, and as an organizer you unfortunately only ever notice the things
at the GamesDay itself that happen not to work.
PS: If you weren't at the GamesDay you can watch the recordings,
by the way.
Today I gave a talk about non-negative matrix factorization, an algorithm used e.g. in data mining to find similarities
in a large number of documents. In this course (Data mining and pattern recognition) we use Python to implement
some nice exercises (spam filters, document clustering, face recognition, etc.) and as preparation for the talk I
implemented some stuff in Python.
There were several aspects of Python I was looking forward to:
Program structure defined by indentation
Operation or function syntax on variables
Speed (well, for an interpreted language…)
The first hours were quite fun (our professor gave a brief introduction). The documentation was a bit troublesome but
I guess I just didn't find the right one. The docs on the Python website are nice if you have enough time to actually
read them but I really miss a reference like the Ruby core documentation. There you have everything in one place and
can find almost anything with a simple browser search.
Program structure defined by indentation
Using indentation to define the program structure was also nice at first. I really like the "executable pseudo code" thing
because I usually use indentation in my notes to structure algorithms. However, when writing larger functions or classes I
had a hard time "scanning" the source code. The start of a specific code structure can be seen immediately but the end?
Maybe I'm a bit too used to languages like C, Java or Ruby which have very clear end markers, but I spend much
more time in Python code actually searching for the end of a code block (function, method, …) than in other languages. It
feels like my eyes are hanging somewhere in between the lines, not really sure where to go next. Again, it's probably just
a matter of getting used to it. Maybe a light background color for whitespace in gedit will help.
Operation or function syntax on variables
When doing some "theoretical" stuff with algorithms we often defined abstract data types. These are basically a list
of types involved, the operations that can be used on the data type and the behavior of the operations (usually expressed
in form of axioms). The operations of an array data type might look like this:
In the pseudo code of algorithms I use this "operation" syntax because I usually think first about what to do and second
about the data that gets manipulated.
I was looking forward to Python supporting this style of programming. However, it looks like this operation or function style
of programming is currently being replaced by object orientation. While this is not a bad idea, the results look somewhat strange.
For example joining an array of words:
a = ["hello", "world"]
", ".join(a)
Why is the string ", " responsible for joining the array a?!
The second point is that join is a method. Or to be exact, that you first have to think about the data that gets
manipulated (here the ", ", then a) and somewhere in between about what is actually done (join). To stay
in a world of operations you could write
join(a, ", ")
where the interpreter could check whether the first parameter a defines a join method and call it. This way you could
stay in the mental model of operations and it would also be consistent with the way you define methods in Python with
self as the first parameter.
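The operation style can at least be emulated in today's Python. In this small sketch the helper names join and operation are made up for illustration, they are not part of Python. The built-in len(a) already works in a similar way: it calls a.__len__() behind the scenes.

```python
def join(items, separator):
    """Operation-style join: the data comes first, the separator second."""
    return separator.join(items)


def operation(name):
    """Turn a method name into a function that dispatches on its first argument."""
    def call(subject, *args, **kwargs):
        return getattr(subject, name)(*args, **kwargs)
    return call


split = operation("split")     # split(text, sep) calls text.split(sep)

a = ["hello", "world"]
greeting = join(a, ", ")       # "hello, world"
words = split(greeting, ", ")  # ["hello", "world"]
print(greeting, words)
```

join needs its own wrapper because in Python the separator string, and not the list, owns the join method.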
On the downside you would lose the ability to do neat method chaining. However in many languages with the ability it isn't
used anyway (e.g. Java). The syntax for creating a new instance of a class would also look somewhat strange:
new(Post)
given that new is an operation of the Post class (like in Ruby).
Does Python really need another unremarkable form of object orientation (OO with a dot)? Why not spice up OO a bit to amplify
some unique ideas and mental models behind Python?
Speed
In this case Python really showed off. To be more exact, numpy.matrix did. I did some rather extensive matrix multiplications with
relatively large matrices (e.g. 2000×500) and the performance was quite impressive. It calculated the results in seconds where
the Ruby matrix class took minutes to do the work (and is not really made for this kind of job).
Anyway, these are just my thoughts about Python after playing with it for some days. It just feels like there are two directions
within the current Python language (operations and OO) and this looks a bit inconsistent from my point of view. But again, maybe
I just don't know the right Python features at the moment.
Yesterday I got access to a newly created virtual machine I can use for my streaming project. Initially I had doubts
about the performance of the VM because ffmpeg2theora is encoding a DV stream there.
However after a look at htop these doubts vanished instantly:
A screenshot of htop on the new virtual machine.
This is simply the best VM I ever had. I have never seen a CPU load of nan before… :)
For everyone interested: on Ubuntu Linux you can find the diagram of the keyboard layout under System → Preferences →
Keyboard → Layouts → Add …
To my great surprise you can find e.g. the symbols for the arrow keys right there (Alt Gr + z
for example gives ←). This makes using the arrows very pleasant. There are also some more handy
characters that I use now and then, like the ellipsis … on Alt Gr + . or various
mathematical symbols like × on Alt Gr + Shift + ,.
These characters don't always work on Windows, but as long as you stay in the Linux world
they are really handy. :)