So I'm still working on paying off my Technical Debt. There's a lot to figure out, things that need to get updated to work with newer systems that make life easier... and I'm getting there. Slowly, but surely, I'm getting there.
I've followed a fairly strict practice of versioning in the past. Without even realizing it, I've been following the spirit of Semantic Versioning, though technically only about 90% of the way there: most of the gap comes from committing all the time without always updating the version.
The problem has reared its ugly head because I've recently changed my methodology for handling dependencies. With Subversion, I would "simply" use externals (svn:externals) to pull in external libraries using a fairly strict versioning system. Unfortunately (or maybe fortunately), each application was locked to a very specific version number, when technically each project should only have needed compatibility down to the major + minor version (not all the way down to the patch level).
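For anyone who never used externals: an svn:externals property value is just a list of "local directory, then repository URL" pairs, with the version pinned by pointing at a tag. A sketch of what mine looked like (the repository URL here is made up for illustration):

```
cs-phpxml http://svn.example.com/repos/cs-phpxml/tags/1.2.0
```

Setting it meant running `svn propset svn:externals` on the parent directory, and every checkout of the application would pull that exact tagged version along with it, which is precisely how things ended up pinned all the way down to the patch level.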
Currently, I'm handling dependencies using Composer and Packagist (here's my public profile on Packagist). This way of handling dependencies allows me to use automated continuous integration systems like Travis-CI (I can't link to my profile, since it just goes to a blank screen, so search for "crazedsanity"), so I can be even more confident that my code will work on other systems. Oh, and I can put up cool images like this, showing current and VALID build statuses for my projects:
(I totally understand if your eyes jumped to that grid right away)
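For the curious, Composer is also what lets me stop pinning to exact patch versions: a composer.json that says "this major + minor version, any patch level" looks roughly like this (the package name is illustrative, based on my Packagist vendor name):

```json
{
    "require": {
        "crazedsanity/cs-content": "~1.2.0"
    }
}
```

The `~1.2.0` constraint means "at least 1.2.0, but anything below 1.3.0," which is exactly the major + minor compatibility I said the old svn:externals setup should have had.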
Anyway, I've made significant strides toward code test coverage. That is to say, I've added unit testing to my systems in a more pronounced way, striving for TDD (test-driven development), so that I can get to a point where I'm fairly confident of how any new change will affect existing code. This will also help determine whether a release should be a patch, a minor version, or even a major version.
Hopefully this article gives you some insight into what I'm working on. Maybe it'll even give you some ideas, or help you, or encourage you to help me.
I've found some new shinies. Since I've started using git for version control on GitHub, I've found a lot of very cool things. Things that are available because of the coolness which is GitHub. Things that automate stuff for me, things that make my life simpler and more secure.
Here's the list. I'll post more information about them in the future as I have time. For now, here are the things I've run into:
It's been a whirlwind of coding changes, but I've gotten three libraries working with Travis-CI, to the point that they're now automatically tested whenever a commit happens, so I know if I broke something. The testing isn't very complete, and I found quite a bit that's out-of-date, but it's at least a start.
Oh, and did I mention how automated it was? YEAH.
Earlier this year, I posted a message about my goal: I want to run a business or entity (or whatever) where developers come when THEY want to build stuff. It's paraphrasing a bit, but that's because I worded things in that post a little strangely. But what has been holding me back? Is it my code, or... what?
Back in the day, when I open-sourced a bunch of my code, I had to make some arbitrary decisions regarding licensing. I didn't know how I should license these libraries, or how I was going to use them; I just knew they needed to get out into the light of day. At the time, it seemed like the best licensing scheme was to go with the GNU General Public License (or "GPL").
To be honest, I haven't really thought about that decision much for many, many years, until just recently. At work, I've been working with a new client to build an application for them. Suffice it to say, it was determined that this particular application would be a PHP-based web application. I was very excited, as I very much love to build web applications in PHP... that's when it hit me.
We talked about some implementation concerns, about implementing frameworks, when Darkman revealed that we wouldn't be able to use frameworks that were GPL'ed (that is, licensed under the GNU General Public License). I realized immediately that this meant I couldn't use my own code as it was, so we agreed to use CakePHP... and that my libraries would probably never be used for any commercial application if they remained under the GPL.
During the course of reading about CakePHP, I realized that its licensing led to its seemingly widespread adoption: they used the MIT license. It's a very open license, which allows the code to be used in commercial applications without those applications having to be open-sourced themselves.
And finally we get to the point: I've dual-licensed my core libraries/frameworks under both the GPL and MIT licenses. That means they can be used in commercial applications, which should mean wider adoption.
Without further ado, here are the libraries and the corresponding versions that have been dual-licensed (more will be updated in the future):
Prophet and I had a conversation this week, and he asked me what I wanted to do. I'd told him all about the web applications/ideas I'd been working on or planning, such as:
There were probably a couple of others that I can't think of right now. There's a lot of things I have sitting in the wings that I haven't discussed with anybody.
Anyway, Prophet basically asks me what my goal is. So here's my answer:
I don't care about building something like Facebook, or something that competes with some other system... I want to be the place developers come to when THEY want to build something to compete with these other systems. I want to have the answer to being able to build a cool web application quickly.
Don't get me wrong, though: I still want to build an application that's kick ass. I want to be able to say that I built a world-class CMS web application, a Project Management system, and other stuff too.
So, I've got a lot of code that I've written over the years. Of course, I'm listed as the maintainer, even though a lot of the code hasn't really seen an update in the better part of four years... until recently.
I've been considering doing a lot of things to them, but I really haven't had a reason to. Back when I originally wrote them, I was using them all the time. Then I open-sourced them, and continued to maintain them (some to a much greater degree than others).
As I had less reason to use the libraries, they began to stagnate. The list of bugs grew, and the number of commits shrank. Developers came, showed a vague amount of interest, then disappeared (see also my "Failure" entry).
The project management application, also one of my creations (well, a rewrite of somebody else's creation), faded into obsolescence. Occasionally I was reminded of it because spammers found it and began auto-posting comments on the issues... but then it once again faded.
Well, as I'm gearing up to work on these things again, I realize that I'm having to pay the technical debt. A lot of the things on that page really rang true for me. And there's a hell of a lot of debt to be paid.
To hopefully get a bit more exposure, I've moved all of the open feature requests and bugs to the associated issue trackers on GitHub. If you were associated with any of the old issues, you're probably aware of it (though I can imagine some of the messages went into spam folders).
There is a configuration management system out there called "Puppet." It's a very cool system that basically allows configuration of servers on a massive scale.
In comes Vagrant. It basically creates pre-configured virtual machines, or virtualized servers (basically the same thing). It's set up so that anybody can build a "base" virtual machine that future machines can be built on. It has scripting abilities so that the machine can be customized after it is initially created. In other words, there is a way to fairly quickly build a massive number of virtual machines, each with its own set of requirements.
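To give a feel for it, here's a minimal Vagrantfile sketch (the box name and script path are made up; this is the general shape, not my actual config):

```ruby
# Start from a shared "base" machine, then customize it after first boot.
Vagrant.configure("2") do |config|
  config.vm.box = "my-base-box"                  # the reusable base VM
  config.vm.provision "shell", path: "setup.sh"  # scripted customization
end
```

One `vagrant up` later, you have a fresh machine built from the base box and run through the provisioning script.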
So we've got Puppet, which can configure systems en masse. We've got Vagrant, which can build virtual machines en masse. With these tools, I can create a ton of virtual servers, and ensure that all of them are configured a specific way (and continue to be configured that way).
The servers eventually (if not from the word "go") get their own special configuration file. This file ensures that the servers have certain programs installed and certain users present (or purposely absent, in the event that a sysadmin leaves).
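A rough Puppet manifest fragment along those lines (package and user names are made up for illustration):

```puppet
# Make sure required software is installed...
package { 'postgresql':
  ensure => installed,
}

# ...that the accounts we need exist...
user { 'deployer':
  ensure     => present,
  managehome => true,
}

# ...and that a departed sysadmin's account is purposely gone.
user { 'former_admin':
  ensure => absent,
}
```

Puppet keeps enforcing this on every run, so a server that drifts from the description gets pulled back into line automatically.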
So... let's imagine an environment with a thousand servers, all with their own Puppet configuration. If one server goes down, a replacement can be built in just a few minutes, and fully configured just a few minutes later, fully prepared to take over the server that went down.
In fact, a replica of any of the existing systems can be built on demand in just a few minutes. A virtual replica of a production server can be built for a developer to run on their machine for testing.
It's freaking cool. Words cannot express my excitement.
I've noticed a very disturbing trend as I was going through emails today. The theme, if you couldn't guess from the title, is failure.
They're not ground-breaking changes. In fact, if it all goes right, none of my normal users will even realize there's been a change at all.
I'm working on updating my code libraries to... well, to be better, and safer.
For those interested (if you've read this far, that means you are), I'm changing the database abstraction layers to use PDO. It's turned into a massive rewrite just to be *partially* backwards-compatible, so I've concentrated on only dealing with PostgreSQL.
Once I've rewritten the main library, I'll end up having to rewrite my website's code to handle the non-backwards-compatible changes. And I'll have to decide what all is going to stay, such as the news feed on the main page: right now, that is being run by an old library called "rssdb" or something like that, which basically takes news feeds and dumps them into a database. It was cool, at the time... but I think it's outlived its usefulness.
I've been told by others (including my wife) that they read the news on the front page, so apparently I'll either need to update that library or find a simpler/different way to get the news. Maybe I'll skip the part about writing it to the database, since I don't really utilize any of that anyway.
Aaaaaanyway.... so this big change has been coming for a long time. I'm still struggling with whether it's more important to be backward-compatible, or to just get it done. The former means that dependent code won't have to undergo quite such extensive rewrites... the latter means that the darned thing might actually get done. I'm trying to limit the amount of work I have to do in order to implement something that probably should have been done from the start.
Oh, and to throw another wrench into the works, I'm looking at converting my webserver from Apache to nginx (that's "Engine-X"), because it's a lot faster.
Ugh. Thanks for reading (especially to my wife).
Back to coding.
Stored Procedures (Database)
AJAX (in general)
It's been a long time coming, but I'm starting to get back into serious code development. It's time to get back on the horse.
"What have you been doing," you ask? Well, the excuse I'm going with for now is that I've been trying to get other developers to help me out.
I have been unsuccessful so far. I've reached out to the open source community through SourceForge.net, and I've even got a few inquiries, but nothing substantial. I have friends that have expressed interest, but we're all having a problem getting motivated (curse you, Diablo III).
I'm still hoping to get others interested. I'm not sure how, and I'm about done with begging.
To get others to notice, I'm pushing things to more places. Repositories are being mirrored on GitHub, so I'll be pushing changes there and to SourceForge.net's SVN repositories. It will probably be cumbersome, but I'm also hoping to automate it a bit.
Should you be expecting new stuff from me now? No. Interested in getting notified if I do? Go to my page on GitHub and follow me. Or go to SourceForge.net and do that.
After talking with friends and colleagues about it for a while, I've finally started using Git instead of SVN (Subversion) for source control of my software.
I guess the main reason for using Git is that it's an order of magnitude faster. It's not as simple, and there are a lot of things I still need to figure out, but it is a LOT faster.
Also, with the use of GitHub, all the code is easy to find. I can probably make a copy of it over at SourceForge.net as well... but I have to figure some things out first. I've gotta work on how to build software releases and such, along with some minor changes now that it isn't using some of the special features of SVN.
So... yeah. I'm not sure if any of this made sense. I'm not feeling all that well, my brain feels kinda cloudy, so I'll leave it at that. Maybe I'll make a post later with more details or something.
I've got a lot of code-stuff I'm working on. It seems like Darkman is going to start helping, at some point (instead of just continuously asking why I'm not using X or Y), which will be good... but I need goals.
The major goal at this point is getting CS-Project version 2.0 off the ground. Version 1 is okay, though it's very stale, not very web-two-point-oh-ish, and generally kinda clunky. And the last version, v1.1.5, was released in the middle part of '08, which is quickly becoming more like four years ago. FOUR YEARS.
Okay, so the big goal is to have a new release of CS-Project. There are a lot of libraries that have been built to help out, but I need something short-term to get me off & running. So, here goes:
Okay, there's a good list. Now I'm off to figure out how to accomplish that.
I make code. Other people use my code, or at least look at it. The frequency and volume of updates on those projects will sway people toward or away from using them.
So here's where you (or somebody else) comes in. I need somebody that's willing to read some emails and turn them into something a little more public-oriented.
The emails are generally very cerebral. I need somebody to read them, know what they mean, and make them more publicly consumable. They're automatically sent to mailing lists so others can read them, but I need them in the form of news for all the sets of code they affect: for instance, CS-Project utilizes several libraries, like cs-content, cs-webapplibs, cs-phpxml, and probably some others. I need somebody who will collect those emails and work with me to regularly publish news updates.
Do you know someone that could help? Please let me know! Comment on the post, or go to the Contact Us page to let me know!
Sometimes it is tough to stay focused. Especially when it comes down to getting caught up in the minutiae, the little details that turn out to take just as long as (or longer than) the entire rest of the project.
Last week I was working on an idea for COMET. The basis was long-poll Ajax requests which discovered events from a database event log. This particular log is one used by my "CS Web App Libs" project (web application libraries). There's a class/library that covers logging things to a database, along with some (mediocre, mostly untested) retrieval code.
Okay, so I did all the Ajax stuff and got the PHP script on the backend into a polling loop with intervals as short as 100ms (that's how long it waited between polls; I didn't check how long the query itself took). The query would ensure results were greater than the last retrieved log ID (a unique, always-incrementing number) so that no results were duplicated...
That's when I realized that this database logger (cs_webdblogger) couldn't limit results like that. It seemed like it was because there was a list of fields to filter by, and the "log_id" wasn't one... so I added it.
Uh-oh. I then found that the code that builds the SQL string (to poll the database) purposely would not allow special characters (such as ">"), so I couldn't do a query which had "log_id > 12345"...
So I started looking at that code. I realized it was only built to build a string out of an array. There were "patches" done to make it work for SQL, but they were kinda shoddy. So then I started thinking of building something similar, but specifically for SQL.
This will be the, what... 3rd step back? Fourth? So I started working out how to change that code so that it would do the ENTIRE SQL STATEMENT (instead of just segments). Including the table joining and what-not.
That's when I stopped. Four steps back, maybe five... But I decided to circumvent the "string from array" code and add an argument to a method, add the " log_id > 12345" part, and move on.
And you wonder why I have problems staying focused?
AND THEN... I post a blog about it. And realize that, because of the length of my new blog, it's tough to see if there are other new blogs by other people... so I start thinking about how to have an "expand" option, and how to stylize it... ugh.
And, once again, I've lost focus.
Okay, I've now done some very rough, preliminary testing on my ideas about COMET. And the results are impressive.
First, let's look at my current test environment.
I've got a VM (Virtual Machine) running Apache + PHP. It's Debian Linux 6.0.3, with Apache 2.2.20 and PHP 5.3.3. It has ~512M RAM and a single 2.4GHz processor. Nothing fancy.
The host machine (what the VM is running on) is a Lenovo laptop. It's Ubuntu Linux 11.10, with PostgreSQL 9. It has ~3G RAM and a dual-core 2.4GHz processor. Nothing fancy there, either.
To make things interesting, the VM's web application connects to the host's database. Not a great solution, but it was spawned from a problem with PHP/Apache faulting on the host machine. I didn't have time to deal with it, so I created the VM and ran the test site from there. If anything, this (combined with the fact that there's a VM in the mix) would hinder response times.
The application, currently only in testing, is pretty simple. Upon startup, it performs an Ajax "key validation" request: this is an idea right now, but basically ensures that the current user (which must be logged-in) only has one instance of the application open. It then immediately sends a "long poll" request to the server, which returns after ~60 seconds or when new data is retrieved.
The data in question is simply some navigation logs. Whenever someone views the website (a copy of CrazedSanity.com), new logs occur; the application would return those new logs and then immediately poll for new ones.
I'm actually extremely impressed with the results, especially given the complex (or maybe convoluted) environment. I loaded 4 pages at once (in Firefox, I opened four tabs, then right-clicked and selected "reload all tabs"), and both sides were extraordinarily fast. Most of the delay seemed to be in the browser itself. The long-polling request returned immediately after I selected to reload the tabs, and immediately sent another request.
So, the next steps are to put it on the test webserver, which lives on the same server as the one you're looking at. That will give more definitive, real-world results on how fast it is. Stay tuned!
As you may have read in my previous blog, I'm working on some ideas for building real-time web applications. I just finished a meeting with Prophet and Darkman, and I'm excited.
Darkman wasn't as interested in my ideas as I'd hoped, but I don't think I expressed them completely. They're still in the idea stage, though, which doesn't really help... sometimes it's tough to express ideas, especially technical ones.
So, my goal for next week's meeting: I'm going to come up with the fundamentals of the "CS-Comet" system. I need to have:
I think that's a good list. Now I'm off to start writing stuff down and maybe throw some code out there.
I'm trying to build a very responsive web application. Something that can handle "real time" chatting and so forth, but at the same time something that won't require a massive infrastructure to handle. I've used Ajax, and it works to a point, but doesn't really scale very well.
I did a very simple test to see if a standard Ajax-based system would work. I created a "ping" test, where each client (browser) that connected would do a very simple request to the server, one right after the other, to see how quickly clients could get updates. I saw the server's load was immediately affected. When I scaled it up to 5 clients, there was a very significant increase in the server's load... which made me realize that it simply would not scale well at all.
Trying to Scale
So I've looked into some alternatives that will resolve this scalability issue.
The "Comet Programming" technique seemed to be a way to overcome this obstacle, so I started looking for pre-built solutions. The "APE" project (Ajax Push Engine) seemed to be the answer, especially after reading the documentation and the easy-to-understand comic explaining standard Ajax versus Comet (or APE, which is a form of Comet).
Now, the reason for all of this is mostly for a web application I've built that has been nicknamed "TTORP" (Table Top Online Role Playing; a.k.a. "Battle Tracker" or "cs-battletrack"). It's a system that people who like to play the old pencil-and-paper tabletop RPG games can use to make things faster: storing character sheets, and (eventually) chatting and having a screen that shows what's happening in a battle in real time.
So... back to the Comet thing.
I tried working with APE and a few other systems. They were very complicated, to the point that I didn't really know how they worked at all. I've worked with Darkman (long-time friend and associate) to figure this out, but it just never worked for me.
Then, one day when I wasn't even thinking about it, a light suddenly turned on (like in the cartoons). I read a bunch of things online about others trying to implement a Comet system (specifically with PHP), but there wasn't a pre-built framework available... and I realized that I already had 99% of the necessary PHP libraries done! With a little planning, I realized that I could build this framework into one of my existing libraries, "CS Web App Libs."
Building My Own Solution
So... the way I plan on implementing my own Comet framework is simple (where "simple" is used pretty loosely). The premise is that everything logs important events to the server using the "webdblogger" system built into "CS Web App Libs." The backend will have a "long poll" script that scans for changes: it loops, looking for new changes every second for up to 60 seconds, and returns either immediately when results are found or once that time has elapsed.
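The actual implementation is (or will be) PHP, but the loop itself is simple enough to sketch here in JavaScript; the function and parameter names are mine, and `fetchLogsAfter` stands in for the webdblogger query (roughly "give me rows where log_id > lastId"):

```javascript
// Sketch of the backend long-poll loop: keep checking for new log entries
// until some appear or the time window runs out.
async function longPollForEvents(fetchLogsAfter, lastId,
    { timeoutMs = 60000, intervalMs = 1000 } = {}) {
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    // Ask for anything newer than the last log ID the client has seen.
    const rows = await fetchLogsAfter(lastId);
    if (rows.length > 0) {
      return rows; // return immediately when new events show up
    }
    // Nothing new yet; sleep before re-polling.
    await new Promise(resolve => setTimeout(resolve, intervalMs));
  }
  return []; // the 60-second window elapsed with no events
}
```

The always-incrementing log_id is what makes this safe: the loop can hand back everything newer than the last ID without ever duplicating results.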
On the front end (browser), there will be two different request types: "LP" for Long Polling, and PITR for Point In Time Requests. The LP "thread" will be dispatched right away, and will be running for the duration that the page is open. The server will hold on to this request, only returning when the allotted time (60 seconds) has expired or new information is available. The PITR will fire whenever the user does something (e.g. clicking a button).
How It Works...
The idea is basically that the "LP" request will be listening for changes. The page requested by the LP is built to wait and scan for events, so it can be ultra-responsive without requiring the browser to send a flood of requests (see the APE comic on this). If one event happens in a minute, LP means just two requests (at, say, 30 seconds it returns with the event, then another request is fired off to watch for more), whereas one-per-second polling means 60 requests... and probably still a delay: if the event happens right after a poll returns, there could be nearly a full second of lag, whereas LP would catch it right away.
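The browser side of that conversation can be sketched like this (again, names are mine, not from the actual app; `sendLongPoll` would wrap the real Ajax call, and it's injected here so the loop can be exercised on its own):

```javascript
// Sketch of the client-side "LP" thread: fire a long-poll request, wait for
// it to return (the server holds it open up to ~60s), hand any events to the
// UI, then immediately dispatch the next request.
async function runLongPollThread(sendLongPoll, onEvents, { maxCycles = Infinity } = {}) {
  let lastId = 0; // highest log_id we've seen so far
  for (let cycle = 0; cycle < maxCycles; cycle++) {
    const events = await sendLongPoll(lastId);
    if (events.length > 0) {
      lastId = events[events.length - 1].log_id; // remember the newest event
      onEvents(events);                          // let the page react
    }
    // A timed-out (empty) response just means "nothing happened"; either way
    // the loop re-dispatches immediately. PITR requests would fire separately
    // whenever the user does something.
  }
  return lastId;
}
```

The `maxCycles` option only exists so the sketch can be stopped; in a real page the loop would run for as long as the page is open.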
So here's a chart that basically explains this "conversation" over a 5-minute period using LP:
This new blog is built specifically for sharing information about the projects I'm working on. It is different from the Developer's Corner, which is for sharing information for a specific project.
I've got a lot of code I'm working on at one time. Right now it's just me, though in January (I'm hoping) Prophet will be joining, and potentially Darkman shortly thereafter. I've come to the realization recently that people who are interested in some of my stuff, like the web-based project management application "CS-Project", don't really know what's going on: I was approached by a company that wanted to take over the code base because they thought it was abandoned.
So this will be where I explain the bits I'm working on and how they connect. With CS-Project, there's a lot of other libraries I'm working on that link to it, but that aspect isn't really covered well. Nobody seems to realize that the "CS-Content" framework is a big part of it, or that there's an XML library ("CS-PHPXML") that is required, or that another framework ("CS-WebAppLibs") is involved. They're all important pieces to the CS-Project puzzle, but others don't realize how inter-connected all those bits are.
So this is my attempt to expose those connections. To explain where I'm going with things on a (hopefully) regular basis. So I can refer to entries here when people ask things like, "why isn't there something on your website so I can change my password and stuff?" Yeah... did you know CrazedSanity.com is a test bed for nearly EVERY code library that I work on?
Here's a list of (at least some) of the code projects I'm developing on:
Want to know more? All the libraries are on SourceForge.net... click the link. Do it. I dare you.