At last Friday's pinu.ee meetup a question was raised:
How many programming questions are actually asked in Estonian forums, and how does pinu.ee stand compared to its competitors?
To answer this myself, I took the 7 most popular Estonian programming forums and looked at how many new topics had been started in each of them during the first two months of this year.
After some wgetting and grepping I got the following results:
Only php.ee, phpcenter.eu and pinu.ee are purely programming forums. In the others, programming is discussed in one of the subforums. For Hinnavaatlus, the Programming and WWW subforums have been added together (25 and 53 new topics respectively). For the rest, only a single subforum has been counted.
Of course there are several other forums in Estonia where programming is discussed, but during my research I found none where more than two or three new topics had been started in the period under review.
In total, 336 new topics were created in all these forums over the two months. Since a few more topics in less popular forums come on top of that, we can safely round this number up to 350.
So, on average, 350 / 2 = 175 questions a month are posted to Estonian forums.
This sorry analysis probably won't win me a statistician of the year award, but it should at least make clear what order of magnitude we are talking about in the Estonian context.
In addition to new topics, I also counted all the posts made in January and February 2010. Number of posts = number of new topics + number of replies. All the details are in the table below:
| Forum | New topics | New posts | Posts per topic |
|---|---|---|---|
| php.ee | 127 | 1333 | 10.5 |
| hinnavaatlus.ee | 78 | 897 | 11.5 |
| phpcenter.eu | 64 | 163 | 2.5 |
| koffer.ee | 34 | 183 | 5.4 |
| pinu.ee | 15 | 97 + 63 | 6.5 + 4.2 |
| vision.pri.ee | 10 | 69 | 6.9 |
| torufoorum.net | 8 | 45 | 5.6 |
| Total | 336 | 2850 | 8.5 |
The php.ee post count would be even higher had I not subtracted a bunch of posts that were spam of the purest kind.
pinu.ee is a bit of a special case. The first number is the count of classic posts, the second the count of comments.
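The posts-per-topic column is simply new posts divided by new topics. A minimal sketch that recomputes it from the table, assuming the numbers are copied over correctly; the `forums` array and the `perTopic` helper are names made up for this illustration, and the pinu.ee row adds its 97 posts and 63 comments together:

```javascript
// Numbers copied from the table; for pinu.ee, posts = 97 posts + 63 comments.
const forums = [
  { name: "php.ee",          topics: 127, posts: 1333 },
  { name: "hinnavaatlus.ee", topics: 78,  posts: 897 },
  { name: "phpcenter.eu",    topics: 64,  posts: 163 },
  { name: "koffer.ee",       topics: 34,  posts: 183 },
  { name: "pinu.ee",         topics: 15,  posts: 97 + 63 },
  { name: "vision.pri.ee",   topics: 10,  posts: 69 },
  { name: "torufoorum.net",  topics: 8,   posts: 45 },
];

// Posts per topic, rounded to one decimal place.
function perTopic(posts, topics) {
  return Math.round((posts / topics) * 10) / 10;
}

const totalTopics = forums.reduce((sum, f) => sum + f.topics, 0);
const totalPosts = forums.reduce((sum, f) => sum + f.posts, 0);

for (const f of forums) {
  console.log(`${f.name}: ${perTopic(f.posts, f.topics)} posts per topic`);
}
console.log(`Total: ${totalTopics} topics, ${totalPosts} posts, ` +
            `${perTopic(totalPosts, totalTopics)} posts per topic`);
```

Because a topic's opening post is counted as a post, a ratio of 1.0 would mean that no topic got any reply at all.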
The ratios are quite interesting here. In the first two forums plenty of new topics are started and each topic also gets many replies. Phpcenter stands out in this comparison as the odd one out: it has almost as many new topics as Hinnavaatlus, yet each topic gets deplorably few replies.
Pinu, on the other hand, seems to be working well: there are few new topics, but those few generate many replies and comments.
Written on 15 March 2010. Comments (1)
ExtJS comes with some heavy documentation, all generated directly from source code, which contains doc-comments like this:
/**
* Get the index within the cache of the passed Record.
* @param {Ext.data.Record} record The Ext.data.Record object to find.
* @return {Number} The index of the passed Record. Returns -1 if not found.
*/
indexOf : function(record){
That's of course the good-old javadoc format, used by many-many languages nowadays, and the comment above looks perfectly readable. So, what's the problem?
The problem is that when you need to add some structure to your
comment, you have to use HTML. Just to start a new paragraph, you
have to use <p> and when you want to add even more structure, say a
list, the whole thing will get truly ugly:
/**
* Find the index of the first matching Record in this Store by a function.
* If the function returns <tt>true</tt> it is considered a match.
* @param {Function} fn The function to be called. It will be passed the following
* parameters:<ul>
* <li><b>record</b> : Ext.data.Record<p class="sub-desc">The
* {@link Ext.data.Record record} to test for filtering. Access field values using
* {@link Ext.data.Record#get}.</p></li>
* <li><b>id</b> : Object<p class="sub-desc">The ID of the Record passed.</p></li>
* </ul>
* @param {Object} scope (optional) The scope (<code>this</code> reference) in which
* the function is executed. Defaults to this Store.
* @param {Number} startIndex (optional) The index to start searching at
* @return {Number} The matched index or -1
*/
findBy : function(fn, scope, start){
This comment is barely readable, and that's just one of them - ExtJS
contains hundreds, if not thousands of comments like this. Comments
formatted with HTML aren't easy to read, they aren't even easy to
write. Just look at that example - in one place <tt> is used to
mark up code, in another <code> is used. It even contains class
names! So why? Why do they use HTML?
Sure, javadoc uses HTML, but nobody is going to run this code through javadoc. There's no need to be compatible with javadoc. Take for example phpDocumentor, a tool for parsing doc comments in PHP: it supports creating paragraphs and lists without HTML.
The only reason I can see is that ExtJS has grown out of YUI, which also uses doc comments with HTML. But ExtJS doesn't depend on YUI, so there is no reason for its documentation to depend on it either.
But enough of bashing ExtJS, how could we make it better?
So, instead of using HTML, why not use some other markup language that's easier to read and write? Markdown is a good candidate for this role. (I'm using Markdown right now to write this blog post.) So, let's write the above doc-block using Markdown:
/**
* Find the index of the first matching Record in this Store by a
* function. If the function returns `true` it is considered a
* match.
*
* @param {Function} fn The function to be called. It will be
* passed the following parameters:
*
* - **record** : Ext.data.Record
*
* The {@link Ext.data.Record record} to test for filtering.
* Access field values using {@link Ext.data.Record#get}.
*
* - **id** : Object
*
* The ID of the Record passed.
*
* @param {Object} scope (optional) The scope (`this` reference)
* in which the function is executed. Defaults to this Store.
* @param {Number} startIndex (optional) The index to start
* searching at
* @return {Number} The matched index or -1
*/
findBy : function(fn, scope, start){
That's way more readable! Isn't it just obvious that this is the right solution?
It's also really easy to implement: Markdown implementations exist in several languages, it's just a matter of running the text-parts of doc-comment through a Markdown formatter. It's actually so easy, that it took me just one evening to add Markdown support for one of our internal documentation tools (it parses doc-comments of PHP files and generates HTML documentation for our AJAX API).
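To give a rough idea of what "running the text-parts through a Markdown formatter" could look like, here is a minimal sketch. It is not the code of my internal tool (that one is in PHP), and both `parseDocComment` and `tinyMarkdown` are hypothetical names. `tinyMarkdown` is only a stand-in that handles `code` and **bold**; a real tool would call an actual Markdown library at that spot, and would also peel off the `{Type}` and parameter name and resolve `{@link}` before formatting.

```javascript
// Hypothetical stand-in for a real Markdown formatter: it only handles
// `code` and **bold**, just enough to show where the call would go.
function tinyMarkdown(text) {
  return text
    .replace(/`([^`]+)`/g, "<code>$1</code>")
    .replace(/\*\*([^*]+)\*\*/g, "<b>$1</b>");
}

// Strip the comment decoration, then split the body into the leading
// description and the @tag sections that follow it.
function parseDocComment(comment) {
  const lines = comment
    .replace(/^\s*\/\*\*/, "")   // opening /**
    .replace(/\*\/\s*$/, "")     // closing */
    .split("\n")
    .map((line) => line.replace(/^\s*\* ?/, ""));  // leading " * "

  const sections = [{ tag: null, text: [] }];
  for (const line of lines) {
    const m = /^@(\w+)\s*(.*)$/.exec(line);
    if (m) {
      sections.push({ tag: m[1], text: [m[2]] });
    } else {
      sections[sections.length - 1].text.push(line);
    }
  }
  // Only the free-text part of each section goes through the formatter.
  return sections.map((s) => ({
    tag: s.tag,
    html: tinyMarkdown(s.text.join("\n").trim()),
  }));
}

const doc = parseDocComment(`/**
 * If the function returns \`true\` it is considered a match.
 * @return {Number} The matched index or -1
 */`);
console.log(doc);
```

The point of the split is that the tag structure stays exactly as it is today; only the prose inside each section changes from HTML to Markdown.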
But there's more to ExtJS documentation...
Look at the doc-block for Ext.data.Store.Error:
/**
* @class Ext.data.Store.Error
* @extends Ext.Error
* Store Error extension.
* @param {String} name
*/
Ext.data.Store.Error = Ext.extend(Ext.Error, {
This sure looks like a violation of the DRY principle. Aren't the class and its parent obvious from the code? Couldn't the doc-comment parser figure it out by itself?
Here's my version:
/**
* @class
* Store Error extension.
* @param {String} name
*/
Ext.data.Store.Error = Ext.extend(Ext.Error, {
Of course you can have situations where this information isn't easily extractable from the code. If that's the case, then it should be possible to explicitly define classname and parent classname in doc-comment, but it shouldn't be mandatory.
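As a rough sketch of that rule (infer when possible, let the doc-comment override), assuming the parser can see the line of code that follows the comment. `inferClass` and `resolveClass` are hypothetical names, and the pattern covers only the `Name = Ext.extend(Parent, {` idiom from the example:

```javascript
// Guess the class name and its parent from the line of code that follows
// a doc-comment. Only the `Name = Ext.extend(Parent, {` idiom is covered.
function inferClass(codeLine) {
  const m = /^\s*([\w.]+)\s*=\s*Ext\.extend\(\s*([\w.]+)\s*,/.exec(codeLine);
  return m ? { name: m[1], parent: m[2] } : null;
}

// Explicit values from the doc-comment win; inference only fills the gaps.
function resolveClass(explicit, codeLine) {
  const inferred = inferClass(codeLine) || {};
  return {
    name: explicit.name || inferred.name || null,
    parent: explicit.parent || inferred.parent || null,
  };
}

const storeError = resolveClass(
  {},  // a bare @class tag: nothing stated explicitly
  "Ext.data.Store.Error = Ext.extend(Ext.Error, {"
);
console.log(storeError);
```

When the code doesn't match the pattern, `inferClass` returns `null` and whatever the comment states explicitly is all the tool has to go on.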
Another example is events:
/**
* @event datachanged
* Fires when the data cache has changed...
* @param {Store} this
*/
'datachanged',
Or config options:
/**
* @cfg {Boolean} autoSave
* <p>Defaults to <tt>true</tt> causing the store to
* automatically {@link #save} records...
*/
autoSave : true,
And then, surprise-surprise, somehow they got it right with properties:
/**
* See the <code>{@link #baseParams corresponding configuration option}</code>
* for a description of this property.
* To modify this property see <code>{@link #setBaseParam}</code>.
* @property
*/
this.baseParams = Ext.isObject(this.baseParams) ? this.baseParams : {};
This last example makes it pretty obvious that all that repetition can be easily avoided. It's just that... why isn't it avoided then?
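For what it's worth, here is a minimal sketch of how a parser might pick up the member name from the code instead of the comment. `inferMemberName` is a hypothetical helper, and it only knows the three shapes seen in the examples above:

```javascript
// Guess a member's name from the code line that follows its doc-comment.
// Covers only three shapes: a quoted event name, a `this.name = ...`
// property and a `key : value` config option.
function inferMemberName(codeLine) {
  const patterns = [
    /^\s*'([\w-]+)'\s*,?\s*$/,   // 'datachanged',
    /^\s*this\.(\w+)\s*=/,       // this.baseParams = ...
    /^\s*(\w+)\s*:/,             // autoSave : true,
  ];
  for (const re of patterns) {
    const m = re.exec(codeLine);
    if (m) return m[1];
  }
  return null;  // nothing recognizable: the comment has to name it
}

console.log(inferMemberName("'datachanged',"));
console.log(inferMemberName("autoSave : true,"));
console.log(inferMemberName("this.baseParams = Ext.isObject(this.baseParams) ? this.baseParams : {};"));
```

With something like this in place, `@event` and `@cfg` could be written as barely as `@property` already is.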
It's bad that ExtJS doc-comments suck. It's not only the ExtJS team members who suffer, it's the whole community. It's an unwritten standard that when you publish an ExtJS extension, you should document your code the ExtJS way. But when you try to do it, you will suffer. And so, many authors just don't bother.
Programmers don't like writing documentation, it's a nuisance. You have to make it really-really easy, or it just won't be done. That's why we have the doc-comments in the first place: to bring documentation as close to the code as possible. But you aren't really making documenting easier when writing it involves some clumsy markup language. You really don't want to write your comments in XML (another great idea from Microsoft). HTML isn't much better either.
Is there any hope for a change? It doesn't look like ExtJS is going to do anything about it. (Well, nobody really knows what plans they have.) So our only hope is to do it by ourselves, for example by extending ext-doc to support Markdown inside comments.
I have been thinking about this for a long time. Maybe it's time for me to take some action...
Written on 14 March 2010. Comments (2)
Note that I'm not just repeating what the Free Software Foundation has been advocating for ages, I'm stretching it even further. Namely, that normal users are going to download programs as source code, then configure, make and install them. All software being open source will just be a small side-effect of all this.
Isn't compiling your own programs just a harder and slower way to get to the same binary?
Not quite. When you download a general-purpose binary, it will be compiled so that it runs on a whole range of processors. This means that it will be compiled for the worst processor out there. Even when your CPU is the latest and greatest, the binary doesn't make use of its fancy instructions, because it also needs to run on some old processor that doesn't support these instructions.
But when you compile your own, you can tell the compiler to optimize the program only for your processor, taking advantage of all those new nifty instructions.
When you want to go beyond one processor architecture, you just can't get away with a single binary. Today already you need at least 32-bit and 64-bit versions. Want to support Windows, Linux, BSD, OSX - you'll need about ten different binaries. When distributing in source code, you just need one version that can be compiled for all these architectures - but that's what you need to have anyway!
Binaries are the reason why we are still stuck with the x86 architecture. Because most of the software is distributed in binary form, processor manufacturers are forced to stick to the old architecture, otherwise all these binaries would stop working on the new machine.
If software were distributed as source code, processor architectures would be free to evolve. Processor manufacturers would just need to supply an updated version of a compiler and all the source code would nicely be compiled to work with the new processor. It's not that we would then have a bazillion processor architectures, but we wouldn't be locked in to a single one.
Even given all these benefits, going fully to source code might still sound quite insane, but please bear with me while I go through several counterarguments.
Sure, the way things currently are, it's the sacred land of hard-core Linux-aficionados. Even many programmers are a bit scared of typing:
# ./configure
# make
# make install
...not to mention compiling your own (dare I say it) kernel!
But things don't have to be this way. Gentoo is a good state-of-the-art example of compiling made easy. You can compile the whole KDE desktop environment by just typing:
# emerge kdebase-meta
And it will download, compile and install all the KDE packages and their dependencies.
Sure, Gentoo is widely considered a hard-core distro, but it's not really the compilation part that scares most people away, it's that all the installation and configuration is done from command line. But the compilation itself is way easier than in most other distros.
It's more of an interface issue. There is no reason why one couldn't create a source-based distro with a user-friendly graphical package manager.
Ubuntu takes about 15 minutes to install. Gentoo takes several hours. Nobody is going to put up with that!
But computers are getting faster. A Gentoo install used to take days; these days I can get KDE running within one. Multi-core CPUs don't make most programs faster, but compilation is a highly parallelizable task and here every additional CPU (or core) helps.
Another part of the problem is that C/C++ is notoriously slow to compile:
The way that C++ header files work with a standard compiler, you can wind up re-parsing the same file hundreds or thousands of times. So even with a really fast compiler, you can easily wind up with some extremely slow compile times.
That's exactly the problem that the Go language is trying to fix:
Go compilers produce fast code fast. Typical builds take a fraction of a second yet the resulting programs run nearly as quickly as comparable C or C++ code.
Armed with fast processors and compilers, there is no reason to be slowed down by compilation times. Additionally, for many programs of the future compilation time will not be relevant at all, because the code will be interpreted or JIT-compiled.
This is just a widespread misconception:
Once upon a time I compared the average sizes of binary compiled programs and their source code in C, since the size of the binaries would presumably be highly compressed and efficient. If you stripped comments, the source code was on the average about 1/2 the size of the binary code - though of course it executes slower. YMMV, but no way was the source code enormously bigger. And as far as data is concerned, the number "10" takes a wasteful 16 bits to store, but the binary version takes an efficient 4 bytes.
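The closing remark of that quote is easy to check. A tiny sketch, assuming Node.js (for `Buffer`) and a 32-bit integer as the binary representation:

```javascript
// The number 10 as source text versus as a fixed-width binary integer.
const asText = Buffer.byteLength("10", "utf8");  // two ASCII characters
const asBinary = Int32Array.BYTES_PER_ELEMENT;   // one 32-bit integer

console.log(`text: ${asText * 8} bits, binary: ${asBinary} bytes`);
```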
Of course, in this age of web 2.0 and rich internet applications we can't get away without mentioning JavaScript. Although JavaScript is not compiled, a widespread practice is JavaScript minification (which can be equated to compiling; at least Google does so). Because minification turns JavaScript into shorter JavaScript, it will always result in smaller files, at least by the amount of comments and whitespace.
But truth be told, more important than minification is that you serve your JavaScript files gzipped. Minification is just an extra bit that you should add to achieve maximum compression. A small test with jQuery shows this:
161K jquery-1.4.2.js
71K jquery-1.4.2.min.js
45K jquery-1.4.2.js.gz
24K jquery-1.4.2.min.js.gz
This extra that minification gives you is already not really needed by most web apps. Almost nobody besides Google minifies HTML any more. Soon it will be the same with JavaScript.
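If you want to repeat that kind of measurement on your own code, here is a minimal sketch using Node's built-in `zlib`. The `source` string is a made-up, repetitive file, and `naiveMinify` is a hypothetical stand-in that only drops comments and collapses whitespace, so the actual numbers will differ from what a real minifier gives on real code:

```javascript
const zlib = require("zlib");

// A made-up, repetitive source file; real numbers depend on the real code.
const source = Array.from({ length: 50 }, (_, i) => `
/**
 * Returns the value of item number ${i}.
 */
function getItem${i}(items) {
    // Look the item up by its position.
    return items[${i}];
}
`).join("\n");

// Naive stand-in for a minifier: drop comments and collapse whitespace.
// It would break string literals containing "//", so it is for measuring only.
function naiveMinify(js) {
  return js
    .replace(/\/\*[\s\S]*?\*\//g, "")  // block comments
    .replace(/\/\/.*$/gm, "")          // line comments
    .replace(/\s+/g, " ")
    .trim();
}

const minified = naiveMinify(source);
const sizes = {
  plain: Buffer.byteLength(source),
  minified: Buffer.byteLength(minified),
  plainGzipped: zlib.gzipSync(source).length,
  minifiedGzipped: zlib.gzipSync(minified).length,
};
console.log(sizes);
```

The four numbers correspond to the four files in the jQuery listing: plain, minified, gzipped, and both.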
What about it? You can always obfuscate your source code when you really want to. But I think in the long run most of the software will also become open source - the GPL is programmed to do it, it's just a matter of time.
One would guess that at least the installer of this imaginary OS would need to be binary. But not necessarily... You could still download the installer as source code, compile it, and then burn the resulting binary to an install disk.
Or embedded devices. You wouldn't think of compiling on your phone? Like you wouldn't think of browsing the web from it? Just a matter of good-enough hardware.
Or what do you think?
Written on 23 February 2010. Share your opinion.