Developer's Journal: December 2007

Monday, December 24, 2007

Merry Christmas and Happy New Year

Odds are I will not get around to writing another blog post this year. As with everything else, I will be busy with family gatherings and the like. So I will just wish all my readers a Merry Christmas and a Happy New Year.

I will see you all again in 2008 :)

Friday, December 21, 2007

On Version Numbering

I recently found an article on the blog of an IT magazine that brought up the issue of version numbering in open source software projects.

The magazine: Tietokone Lehti
The blog post: Versionumeroinnin Vaatimattomuudesta

Summary
Both are only in Finnish, so I will summarize here. As the title, "On the Modesty of Version Numbering", makes clear, the author has a low regard for the tried and true way of handling version numbering in open source projects. He suggests that if software producers were not so shy about increasing the version number, it might lead to wider adoption of open source software, or at the very least make a better impression on non-technical people.

While I recognize the point and agree with the author on a fundamental level, in practice I really do not see why this should be.

What Is Version Numbering
It seems to me that the author, along with a great many other people, has either forgotten or never knew what version numbering actually is. It is a strictly technical term that refers to the precise version of the software. The reason it is often a sequence of digits and dots is that simply starting from 1 and counting up is not very informative. Mind you, there are some horrific version numbering systems out there. However, the dominant scheme seems to be the one started by the Linux kernel: N1.N2.N3, where the numbers stand for Major, Minor, and Bugfix respectively. A major release is the largest kind and the most likely to break backwards compatibility. Minor releases include new features and additions to the API, but are usually considered backward compatible. A bugfix release is simply an update that contains bug fixes. Often you can also download a nightly build, which is experimental and unstable.
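As a rough illustration of the Major.Minor.Bugfix scheme, here is a minimal PHP sketch. The helper parseVersion() is hypothetical and written only for this example; version_compare() is PHP's built-in function for comparing dotted version strings. The version strings themselves are just sample values.

```php
<?php
// Hypothetical helper (not part of PHP): split "Major.Minor.Bugfix" into named parts.
function parseVersion($version)
{
    // Missing trailing parts (e.g. "1.0") are treated as 0.
    $parts = array_pad(array_map('intval', explode('.', $version)), 3, 0);

    return array(
        'major'  => $parts[0],
        'minor'  => $parts[1],
        'bugfix' => $parts[2],
    );
}

$v = parseVersion('2.6.23'); // major 2, minor 6, bugfix 23

// The built-in comparison understands each numeric part separately.
$isOlder = version_compare('0.4.9', '1.0', '<');          // true

// Plain string comparison gets multi-digit parts wrong:
// as strings, "0.10.0" sorts before "0.9.0".
$stringOrder  = strcmp('0.10.0', '0.9.0') < 0;            // true
$versionOrder = version_compare('0.10.0', '0.9.0', '>');  // true
```

The last two lines show why the dotted form has to be compared part by part rather than as plain text.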

1.0 is often viewed as the point where you have successfully implemented every feature on your roadmap, without any serious bugs. Sometimes even this is redundant: programmers are a practical bunch, and upgrading the version number from, say, 0.4.9 to 1.0 just for the sake of upgrading sounds quite strange to us.

Product/Brand Names
I think the primary reason for the confusion is that people expect version numbers to double as brand or product names. They have come to expect this from proprietary software vendors, whose names look really sexy, like Windows Vista or Adobe InDesign CS3. But that is all they are: cool-looking names that have almost no relationship to the actual version of the software. Look deeper into your program, usually in the About window, and you will find the true software version. For my Adobe Acrobat this is 8.1.1, rather than just the 8 in the product name.

Behind the Scenes
The reason proprietary software vendors are eager to get to number 1 is twofold. First of all, they understand that 1 is sexier than 0.1. Secondly, they often do not publish programs until they have completed their entire roadmaps. Hence they reach 1 first, while open source programs go by the mantra "release early, release often". Thus it does not matter to us that the program is not fully finished. In fact, if it were, people could not collaborate nearly as much on any one project, which is of course what open source is all about.

Unfortunately, this has traditionally been interpreted as a sign that open source software is inferior to, or of lower quality than, proprietary software. This is not the case. There is always a clear distinction between an unstable and a stable release. Unstable releases are development versions used primarily by the actual developers and by experimenters. You should expect them to fail and crash often. Stable releases are just that: stable. They are meant for the larger audience and should crash rarely or never.

Wednesday, December 19, 2007

PHP 5.3 Upgrade: parse_ini_file

I recently started working on a small package to read and write ini files. For the most part PHP does the work for me, as far as parsing ini files goes. Unfortunately parse_ini_file is a bit too clever in its approach, as it converts some of the original values into other values, namely the boolean ones. To make matters worse, there is not a lot of consistency in the way they are handled. For example, "yes" and "no" values in your ini files come out differently: "yes" is converted to "1", but "no" is converted to "" (an empty string) instead of "0" as you would expect.

Because of this behavior, when you read the parsed data from the resulting array you can never know exactly what was in the ini file. Was the original value "yes" or just the numeric value 1? While not an impossible dilemma, it is a bit confusing.
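The ambiguity can be reproduced with a few lines of PHP. This is only a sketch: the key names are made up for the example, and it uses parse_ini_string(), the string-based sibling of parse_ini_file() available from PHP 5.3, so that no file on disk is needed.

```php
<?php
// Made-up settings: a boolean-like word, a number, and a negative word.
$ini = "cache=yes\nworkers=1\nverbose=no\n";

// Default scanner mode: boolean-like words are converted before you see them.
$settings = parse_ini_string($ini);

// $settings['cache']   is "1"  (the file said yes)
// $settings['workers'] is "1"  (the file said 1)
// $settings['verbose'] is ""   (the file said no)

// After parsing, "yes" and 1 can no longer be told apart.
$ambiguous = ($settings['cache'] === $settings['workers']);
```

Note also that the two boolean words are not symmetrical: the positive one becomes "1" while the negative one becomes an empty string rather than "0".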

This has been fixed fairly nicely in PHP 5.3, which adds a third argument to the parse_ini_file function, called "scanner_mode". It takes one of two new constants: INI_SCANNER_NORMAL and INI_SCANNER_RAW. The normal mode is the default, and it means the function works as it always has. The raw mode, however, returns the parsed data as-is: all section and key names, as well as values, are returned as they exist in the file. This allows you to convert the values into native PHP types more robustly and to distinguish clearly between yes/no, 1/0, and true/false values.
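Here is a minimal sketch of how the raw mode might be used to do the type conversion yourself. The helper toNative() and the key names are hypothetical, invented for this example; the set of words it treats as booleans is my own choice, not something PHP prescribes. parse_ini_string() (also new in PHP 5.3) stands in for parse_ini_file() so the example needs no file.

```php
<?php
// Made-up settings for the example.
$ini = "cache=yes\nworkers=1\nverbose=no\n";

// Raw mode: values come back exactly as written in the ini data.
$raw = parse_ini_string($ini, false, INI_SCANNER_RAW);
// $raw['cache'] is "yes", $raw['workers'] is "1", $raw['verbose'] is "no"

// Hypothetical helper: convert one raw ini value to a native PHP type.
function toNative($value)
{
    $lower = strtolower($value);

    if (in_array($lower, array('yes', 'true', 'on'), true)) {
        return true;
    }
    if (in_array($lower, array('no', 'false', 'off'), true)) {
        return false;
    }
    // Only plain digit strings become integers; "1" stays distinct from "yes".
    if (preg_match('/^[0-9]+\z/', $value) === 1) {
        return (int) $value;
    }

    return $value;
}

// array_map() keeps the string keys when given a single array.
$typed = array_map('toNative', $raw);
```

With this approach the decision of what counts as a boolean lives in your own code, so "yes" ends up as true while 1 stays an integer.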

At the time I am writing this, the update has not yet reached the PHP manual but it is mentioned in the NEWS file if you download the PHP 5.3 beta.

Tuesday, December 11, 2007

Ready, Willing, and Able

Seeing the site I mentioned earlier, Function a Day, got me thinking.

You often hear experienced PHP programmers say that there is a mentality among PHP coders to use only a small subset of PHP's capabilities. I know this is true from personal experience: I have written some functionality I needed, small or large, and later found that something very similar was already implemented in an extension that came bundled with PHP.

The reason this happens is that we are not always aware of all the possibilities that PHP offers. In fact, I find it amusing that some programmers are puzzled by this, often attributing it to inexperience and outright sloppiness. While I do not doubt that this is often the case, I do doubt that it is the principal reason. The simple fact is that there are just too many darn functions in PHP to keep track of, and more are added all the time. Most of the time it is actually easier to just write the piece of code you need than to browse Google and the PHP manual looking for a specific piece of functionality. I would even argue that it is a weakness of PHP that there are so many functions.

Another good reason not to always rely on PHP's functions is that they may have annoying defects, or they simply do not behave the way you want them to. The final reason to code things yourself is just to keep yourself writing code. With an abundance of readily available functions and classes offering out-of-the-box solutions, it is easy to get lazy and rely on them too heavily. This erodes a programmer's own ability to write basic things from scratch. Personally, I find it helpful to understand the underlying principles of a PHP function before I use it, and the best way for me to do that is to write the function myself.

In my mind, the solution to this problem is to use PHP's wide array of functionality, but not obsessively. Remember to rely on your own skill and imagination when solving problems. If there is a quick fix in the form of a PHP function, by all means use it; it is faster and more efficient. But do not spend an hour poring over the manual when you could have written your own solution in just ten minutes.

Function a Day

An interesting piece of news appeared on Zend's Developer Zone just a few days ago. A new PHP site called Function a Day has been opened by Paul Reinheimer and Courtney Wilson.

I find the idea interesting, and indeed I spent some time browsing the site today. The first time I heard of it the site was not up yet, which is why I am only posting about it now. The idea is simple: post a new PHP function once a week, explaining what it does and how it works. A great idea, in my opinion, since it is not uncommon for people to write redundant PHP code for tasks that already have a well-defined and well-explained function in PHP's core library. I know I am sometimes guilty of that myself.

In and of itself, the site is elegant and beautiful, and it does the job very well. It is not a tutorial site per se, which is good since we already have tons of those around the internet. Still, the function in question is usually explained in a less documentary fashion than in PHP's own manual, which by definition can be a bit tedious to read sometimes.

For anyone interested, I suggest taking a look.

Wednesday, December 5, 2007

Open For Comments

I just became aware that only people with Google accounts could comment on my blog. While this makes a certain amount of sense, I am also certain that a lot of people do not have Google accounts. I would not have one either if I had not opened this blog. For this reason I have now opened comments to all users, including anonymous ones. All comments are, however, moderated to help cut down on spam.

Tuesday, December 4, 2007

Database Abstraction

I discovered an interesting view of database abstraction a few days ago when I was browsing for a blogging service to register with. It was titled The Illusion of Database Abstraction Layers or Classes. My first thought was "oh dear..." and my second was that I was so going to write about that once I got my blog up and running.

According to the author, "DB abstraction layers will mostly hide the specific interface semantics between the application and the database." This is true, of course, since it is the very definition of abstraction to hide complicated parts of the program from the user. That is why they are so useful: you, the programmer, do not have to worry about the specifics of the database implementation and can concentrate on what is more important, such as pleasing your customer.

But what the author seems to have completely forgotten is that there are a great many levels of abstraction, and most of them are not complete. And indeed this is a good thing. It is not the library's mission to do everything for you, but simply to make things easier. I know, as most programmers who have worked with several of them know, that it is a painful process to build an interface that talks to multiple DBMSs. But the fact that a database abstraction library does not fully support this does not mean that it does not work. It only means that it works up to a certain point, which is not entirely the library's fault because, as the author mentions, compatibility between DBMSs is loose at best and horrifying at worst.

Personally, I have never really understood why people are so obsessed with writing applications that run on at least half a dozen DBMSs anyway. I am sure it looks great in promotional material when you can say that a product runs on MySQL, PostgreSQL, MS SQL, Oracle, and whatever other DBMS you can think of, but does the product really need it? I am going to go out on a limb here and say that most products do not need such support. I have known a large number of applications that were both popular and commercially successful and that supported no more than two DBMSs. If you think about it carefully, the more DBMSs you support, the less well you will support each of them, because, as the author mentions, compatibility between DBMSs is very bad. If you support only a small set of DBMSs, let's say two at most, you can take better advantage of the features they have and make your product all the better.

It is an old rule of business to target your products at as specific an audience as possible, instead of the whole world. This is slightly harder in the open source world because, by tradition, open source software is developed without a specific audience in mind and hence must be as flexible as possible. However, I would still give open source developers the same advice. A targeted audience is better than a wide audience. Do not worry about those you cannot please. Someone will eventually come up with software similar to yours that will serve those you could not target.

Saturday, December 1, 2007

A Call for Pecl Developers

I ran into an interesting podcast just yesterday: a new series in the PHP Abstract podcast, hosted by Elizabeth Marie Smith, called Pecl Picks, where she introduces the basics of PECL and its history and invites anyone interested to start developing PECL packages. A similar call was put out by Wez Furlong in an open e-mail letter that can be found on the PHP news site.

It is titled PHP Extension authors - please join the PECL community.

I find the idea itself interesting. In fact, I have toyed with the idea of writing an extension several times during my time as a PHP developer. Usually I have decided against writing one or taking the time to learn the art of PHP extensions. The reasons have been twofold.

First of all, I have usually struggled to find a worthy project to devote myself to. Whereas I could pick up a dead extension or write a small, trivial one, I would prefer to write something useful. I am all for experimenting and writing small code packages, but often it does not seem worth it to write an extension when you can write what you need in PHP. I know extensions are obviously faster and more reliable, but the bar to taking it up has always seemed too high.

The reason it is so high is also my second reason: the embarrassing lack of documentation and resources. The PECL web site hosts some tutorials, documents, and mailing lists, but they are all but useless to new developers. Mailing lists are not the place to learn from step one, and the tutorials and documents linked on the site either predate the invention of the light bulb or are outright broken. If Wez is serious about wanting to recruit more developers to PECL, I would suggest he start by producing more meaningful and helpful resources for new developers.

Elizabeth did mention a good source to start from in the podcast, which leads me to think I should probably reconsider my position. Extending and Embedding PHP is a book authored by Sara Golemon. It is a fairly recent book; the last revision is from 2006, so it is probably up to date enough to be a good source. A quick glance at the reader reviews on amazon.com supports the view that the book is worth a read for anyone who wants to pick up writing PHP extensions. In fact, according to Elizabeth, it is a must-have.

I will see if I can find it in a bookstore and return to the subject at that time.

All in all, I found the podcast interesting and enlightening. I will be looking forward to the next episode, while I go browse for Extending and Embedding PHP.

Developer's Dogma

I figured I would start the Developer's Journal with an overview of what programming is to me and how I approach it. My favorite language to work with is, without a doubt, PHP, but I have been known to write in Perl, C, Java, and C# in the past. For what purpose, no one can say. I have chosen PHP as my primary tool because developing new and better web applications is what interests me. It is a thankless endeavor, because a lot of application developers look down on web developers. Many hardcore Java and C developers see dynamic languages as mere toys that kids play with. This attitude has lessened somewhat during the past couple of years with the advent of major web applications (other than the usual CMS, framework, or bulletin board software) like Google's web office and Microsoft's commitment to ASP.NET. The support of major corporations for the development of serious web applications has somewhat "redeemed" the web developer community.

However there are still many misguided people with strange and even utterly misconceived notions about dynamic languages and the programs that we write with them. I am not saying that everyone or that even the majority of programmers fall into this category. But the unfortunate truth is that these individuals still exist and they have an active impact on the community.

But I have always believed that it is important, especially to yourself, to stand up for what you believe in. Do not be afraid to have opinions and stand by them. There are fewer wrong opinions than right ones. In fact, it is more important to have opinions than not to have them at all. The world would be a lesser place if we were afraid to make a stand just because we were afraid of being proven wrong.

This dogma has helped me tremendously over the years. I have been known to be wrong many times, but my life is richer for it. When you are a programmer this kind of mantra is even more important, because it can be a very unforgiving environment. Most programmers are not afraid to tell you what they think, wrong or right, and when there is a conflict of opinions (which is to say ... often), at best they disdain your presence and at worst they despise you.

So why are there so many differences of opinion? Because programming is an active process that keeps developing. You can learn new things every day of your life, and in every program there is always room for improvement. It is a perfectionist's nightmare. Then there is also the fact that programmers use different tools that behave differently, and most of us have styles and preferences that can differ from each other like the sun and the moon. This is the reason why you should never be afraid to push yourself onward. Try new tools, new methods, new design principles, and so on. You are likely to make more mistakes by trying new things, but in the end you will also make better and more useful things that way. In a society that is becoming more dependent on technology with every passing moment, this can only be a good thing.

So here in short is my developer's dogma:

Always stand up for what you believe in.
Do not be afraid to take risks and venture into the unknown.
By experimenting and talking with others in the field, try to recognize new and better ways of doing things.

Last but not least, always aim to have fun at what you are doing. :)
