Archive for the “Web Programming” Category

So, I'm working on a project where I want to assign unique IDs to some elements in a list. I did some research on GUIDs (Globally Unique Identifiers) and UUIDs (Universally Unique Identifiers), but the whole thing seemed a bit too overwrought for me.

See, my IDs only have to be unique in a list. That list will probably have, at most, 200 members, so I worry a lot less about collisions (generating and assigning the same ID to two separate items, thus causing confusion). There are a few GUID/UUID generator scripts out there that do all sorts of gymnastics to generate a 32-40 character ID so unique that there's supposedly no chance of it colliding with another ID of that length... ever.

The math bears that out to some extent. A 32-character ID using just A-Z and 0-9 (a.k.a. Base 36) has over 63 quindecillion, or 63 times a trillion times a trillion times a trillion times a trillion different IDs. Put another way, a 70 kilogram (approx 154 pounds) person has approximately 7 octillion atoms in their body, which is more atoms than there are stars in the universe at current estimates. 63.3 quindecillion is enough to give every atom in every person on Earth over a trillion unique identifiers... per atom... and that's counting the fat people too.
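For anyone who wants to check that arithmetic, here's a rough JavaScript sketch using BigInt for exact integers. The 7 octillion atoms per person comes from the paragraph above; the 7 billion people is an assumed round figure, not something measured here.

```javascript
// Size of the ID space for a 32-character Base 36 code, computed exactly.
const keyspace = 36n ** 32n;

// 7 octillion atoms per person (from the text) and roughly 7 billion
// people (an assumed round figure for illustration).
const atomsPerPerson = 7n * 10n ** 27n;
const people = 7n * 10n ** 9n;

// Integer division: how many IDs each atom in each person could get.
const idsPerAtom = keyspace / (atomsPerPerson * people);

console.log(keyspace.toString().length); // number of decimal digits in 36^32
console.log(idsPerAtom > 10n ** 12n);    // is that more than a trillion per atom?
```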

Still, there's a part of me that thinks that a 32-character code (36 with dashes) is just too dang long. I should program a function to generate a 6-character Base 36 code, because while that raises the chance of any two IDs colliding from 1 in gazillions to about 1 in 2 billion (36^6 is 2,176,782,336), it saves 30 characters worth of disk space and computational gymnastics. Since I only need the IDs to be locally unique to one file or one database table, a quick function would do: pick a random number between 1 and 2 billion, then use base_convert(picked_num, 10, 36) to change the random number to a Base 36 code.
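Here's a minimal sketch of that idea, written in JavaScript rather than with PHP's base_convert(). The function names and the Set-based collision check are assumptions for illustration, and it draws from the full 36^6 range instead of stopping at 2 billion.

```javascript
// Assumed keyspace: every 6-character Base 36 string, 36^6 = 2,176,782,336 values.
const KEYSPACE = Math.pow(36, 6);

// Pick a random integer in [0, KEYSPACE) and write it in Base 36,
// left-padded with zeros so the result is always 6 characters.
function randomShortId() {
  const n = Math.floor(Math.random() * KEYSPACE);
  return ("000000" + n.toString(36)).slice(-6);
}

// Hypothetical helper: keep drawing until the ID is not already in the list.
// `usedIds` is a Set holding the IDs already assigned in this list.
function uniqueShortId(usedIds) {
  let id = randomShortId();
  while (usedIds.has(id)) {
    id = randomShortId();
  }
  usedIds.add(id);
  return id;
}

// Usage: assign IDs to a 200-item list.
const used = new Set();
const ids = [];
for (let i = 0; i < 200; i++) {
  ids.push(uniqueShortId(used));
}
console.log(ids.length, used.size); // the two counts match when every ID is unique
```

With only a couple hundred IDs in a space of over 2 billion, the retry loop will almost never run a second time, but it costs next to nothing to have it there.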

But I do realize that the shorter code, while appealing because we should always seek the most efficient way to do things, also carries some chance of collision. That means writing and using collision check/prevention code elsewhere in the script, so it gets balanced out. Additionally, that mindset is a bit of an old-timer mindset. I'm old enough to remember being excited when the price of hard drives dropped under $1 per megabyte, letting me get a 250 megabyte drive for $245 back in the 90s. Yesterday I saw a 1.5 terabyte external drive at Costco for $139.99. At that cost, 250 megabytes of that drive cost under 2.45 cents. Put simply, I was paying 10,000 times more per megabyte 15 years ago than I'm paying today.

Okay, since we're using an ASCII-safe character set and storing with UTF-8 encoding, each character takes one byte, which means the 36-character code (32 letters and numbers + 4 dashes) requires 30 more bytes of storage than a 6-character code. To be fair, we'll triple the full 36 bytes for overhead, making it 108 bytes per ID. To use up $1 worth of disk space on that 1.5 terabyte drive, I'd need to store over 99 million IDs. In my project, I'm figuring a seriously heavy user might generate a few thousand over the course of a year. I don't really have to be sparing to control storage costs.

What about the computational cost, then? Efficiency is important. But, then again, how much does processing horsepower cost? Using the gigaflops (billions of floating-point operations per second) measure applied to supercomputers... Back in 1997, computing power cost approximately $30,000 per gigaflop in a system sporting two 16-processor Beowulf clusters with Pentium Pro microprocessors. Now, a desktop sporting a Core 2 Quad 8200 ($149.99 at Newegg when I last checked) delivers 37.28 gigaflops at a cost of $4.02 per gigaflop.

I decided to try a few methods for generating unique IDs. First, I tried a class that generates a 32-character UUID/GUID, posted by Marius Karthaus in the comments of the uniqid() function documentation on php.net. Next came the uniqid() function itself, which generates a 13-character ID. Then I tried picking a random number between one and a large number and converting it to Base 36, using 2,000,000,000 and 9,000,000,000,000,000,000 (2 billion and 9 quintillion) as the upper limits. I topped out at 9 quintillion because when I went up to 90 quintillion, the result of the mt_rand() function was always 1. I generated 50 sets of 5,000 IDs for each and got the following average execution times per set (running on my local development environment - a MacBook Pro):

UUID: 0.16870986938477 seconds
Random 2 Billion: 0.021249299049377 seconds
Random 9 Quintillion: 0.024264307022095 seconds
UNIQID: 0.11121699333191 seconds
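The timing loop behind numbers like these can be sketched as follows. This is JavaScript, not the PHP script that produced the results above, and the function names are hypothetical; it only shows the shape of the test (50 sets of 5,000 IDs, averaged per set).

```javascript
// Timer: performance.now() where available, Date.now() otherwise.
const now = (typeof performance !== "undefined" && performance.now)
  ? () => performance.now()
  : () => Date.now();

// Hypothetical generator under test: random integer below `max`, written in Base 36.
function randomBase36(max) {
  return Math.floor(Math.random() * max).toString(36);
}

// Run `sets` batches of `perSet` IDs and return the average seconds per batch.
function averageSecondsPerSet(generate, sets, perSet) {
  const start = now();
  for (let s = 0; s < sets; s++) {
    for (let i = 0; i < perSet; i++) {
      generate();
    }
  }
  return (now() - start) / 1000 / sets;
}

const avg = averageSecondsPerSet(() => randomBase36(2000000000), 50, 5000);
console.log("Random 2 Billion: " + avg + " seconds per set");
```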

Computationally, if you factor the time to create versus the odds of collision, the UUID costs less than twice uniqid()'s method, but gives you a few gazillion times as many possible IDs (3.7 x 10^29 times as many). You'll cut processing time by a factor of close to 8 if you go with the one-in-two-billion method, and close to a factor of 7 if you go with the one-in-nine-quintillion method. Of course, even with the most computationally expensive method, generating more unique IDs than I expect any user to need in a year uses less than 2/10ths of a second.
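Those ratios come straight from dividing the measured averages. A quick JavaScript sketch of the division, with the timings copied from the list above (the variable names are just for illustration):

```javascript
// Average seconds per set of 5,000 IDs, copied from the results above.
const seconds = {
  uuid: 0.16870986938477,
  random2Billion: 0.021249299049377,
  random9Quintillion: 0.024264307022095,
  uniqid: 0.11121699333191
};

// How many times longer the UUID class takes than each alternative.
const uuidVsUniqid = seconds.uuid / seconds.uniqid;
const uuidVsTwoBillion = seconds.uuid / seconds.random2Billion;
const uuidVsNineQuintillion = seconds.uuid / seconds.random9Quintillion;

// Ratio of possible IDs: 32 Base 36 characters versus 13, which is 36^19.
// Assumes both IDs draw from the same 36-character alphabet, as the text does.
const keyspaceRatio = 36n ** 19n;

console.log(uuidVsUniqid.toFixed(2));
console.log(uuidVsTwoBillion.toFixed(2));
console.log(uuidVsNineQuintillion.toFixed(2));
console.log(keyspaceRatio.toString().length); // digits in 36^19
```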

So, if a REALLY heavy user completes enough tasks that they need 5,000 unique IDs, those IDs (assuming that data and overhead amount to 108 bytes) will use up 1/198th of a penny worth of disk space and 1/18,000th of an hour worth of computing time.
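The storage and time figures can be rechecked with a few lines of JavaScript. The prices and sizes are the ones quoted in this post; counting a terabyte as 10^12 bytes is an assumption.

```javascript
// Figures from the post: a 1.5 terabyte drive for $139.99, 108 bytes per stored ID.
// Assumption: 1 terabyte is counted as 10^12 bytes.
const drivePriceDollars = 139.99;
const driveBytes = 1.5e12;
const bytesPerId = 108;

const dollarsPerByte = drivePriceDollars / driveBytes;

// How many IDs fit in one dollar's worth of disk.
const idsPerDollar = 1 / (dollarsPerByte * bytesPerId);

// Storage cost of 5,000 IDs, in pennies.
const penniesFor5000 = 5000 * bytesPerId * dollarsPerByte * 100;

// 0.2 seconds of computing time as a fraction of an hour.
const hourFraction = 0.2 / 3600;

console.log(Math.round(idsPerDollar)); // IDs per dollar of disk
console.log(1 / penniesFor5000);       // 5,000 IDs cost one over this many pennies
console.log(1 / hourFraction);         // 0.2 seconds is one over this many hours
```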

I still feel compelled through years of conditioning to try to squeeze every last bit of efficiency out of my code, but the fact that I spent hours researching this, testing it, and determining the relative efficiencies of the various methods means I probably cost myself more than all the computing and disk space savings combined among everybody who ever uses the script I'm working on.


So, the way I learn a new programming skill is to come up with an idea for a program that I can only create by learning the language (or improving my knowledge of it). Then as I learn the things I need to know, I toy with various ideas and do little experiments just to see what happens or what's possible.

This time I wanted to know if I could pass in a variable's name as a string, then use the eval() function to get that variable's value. For example:

var steve="blue";
var blue="danny";
alert(eval(steve));

Now, if it was just alert(steve), the result would be an alert window with the message "blue". But because eval is evaluating the content of "steve" as code... The value of "steve" is "blue" and the value of the variable "blue" is "danny", so alert(eval(steve)) generates an alert window with the message "danny".

But the cool thing is that you can nest evals...

<script type="text/javascript">

var steve="blue";
var blue="danny";

function gotoit(sedley){
    alert(eval(eval(sedley)));
}

</script>

<a href="javascript:gotoit('steve');">Run The Dang Script</a>

In this case, the value of sedley is "steve", which evaluates to "blue", which has the value of "danny".
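The nesting can be generalized into a loop. This is only a sketch: evalChain is a made-up helper name, and it uses console.log instead of alert so it runs outside a browser too.

```javascript
var steve = "blue";
var blue = "danny";

// Hypothetical helper: evaluate a name, then evaluate the result, `depth` times over.
// Like any eval(), it runs arbitrary code, so it only suits trusted input.
function evalChain(name, depth) {
  var value = name;
  for (var i = 0; i < depth; i++) {
    value = eval(value);
  }
  return value;
}

console.log(evalChain("steve", 1)); // one eval: the value of steve
console.log(evalChain("steve", 2)); // two evals: the value of the variable that steve names
```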

Try it for yourself.

I know, sort of pointless, but it was neat to me and I thought I'd share.


While working on a recent project, I found myself with a need to insert blocks of text via JavaScript and wishing I could use PHP's heredoc-style parsed multiline strings. If you're not familiar with it, you might have a bit of code...

<?php

// escape the input so it can't inject HTML into the page
$fred = htmlspecialchars($_REQUEST["fredinput"]);

$ted = <<<EOM
This is a multiline<br>
string, that outputs<br>
the value of "fredinput" as<br>
$fred.
EOM;

echo $ted;

?>

Now, let's say that in whatever form you'd posted to this script, the value of the "fredinput" variable was "Loquacious". The output of that bit of code would be:

This is a multiline
string, that outputs
the value of "fredinput" as
Loquacious.

Now, let's say you had large blocks of HTML you'd want to specify in this way for a JavaScript application. It gets a little painful. It all has to be on a single line, or you have to escape the line break at the end of each line. You have to make sure you escape your single or double quotes. And to put in any variables, you have to close your quotes, insert the variable surrounded by pluses, and then reopen the quotes. It makes that HTML hard to read and easy to screw up if you need to edit it.
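To make that pain concrete, here's a small sketch of the concatenation style just described. The userName variable and the markup are made up for illustration.

```javascript
// Hypothetical variable to drop into the HTML.
var userName = "Fred Flintstone";

// Quotes must be escaped, each line break needs a closing quote and a plus,
// and the variable is spliced in by closing and reopening the string.
var html = '<div class="greeting">' +
  '<p>Hello, ' + userName + '. It\'s "nice" to see you.</p>' +
  '</div>';

console.log(html);
```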

Wouldn't it be nice to just type the big old block of text as straight HTML with lots of whitespace for readability, not have to escape anything, and just prefix variable names with dollar signs to have the variable's value appear?

You can, and all you need is this very short function. It's a bit of a klunky hack because the text is in a hidden div in the HTML document instead of within the script. But otherwise it works nicely.

Here's a sample using the function.

<script type="text/javascript">
// set variables named andy and handy, which we can use as $andy and $handy in our text

var andy = "Fred Flintstone";
var handy = "Steve Austin";

function hereDoc(divid){
    var obj = window; // global variables are properties of the window object
    var str = document.getElementById(divid).innerHTML; // gets the HTML block
    for(var i in obj){

        /* the for loop iterates over every property of window, which includes our global variables */

        if((typeof(obj[i])=="string")||(typeof(obj[i])=="number")){

            /* typeof makes sure it only executes replacement for strings and numbers. The function worked without this test in Firefox and Safari, but threw errors on Opera until it was added. */

            var myregex = new RegExp('\\$'+i+'\\b',"g");

            /* To replace globally, you need a regular expression with the "g" option, but a regular expression literal is unquoted, so you can't use a variable in it directly. So we build the pattern as a string and create a RegExp object, which works in the string.replace() method. The trailing word boundary keeps $andy from also matching the start of a longer name such as $andyman. */

            str = str.replace(myregex, obj[i]);

            /* we replace instances of the variable name with a dollar sign before it with the variable's value */
        }
    }
    return str;

    /* and when the loop is done, we return the processed text to be used however needed */

}

function gotoit(){

    /* fill the "steve" div with the processed contents of the "randy" div. */

    document.getElementById("steve").innerHTML = hereDoc("randy");
}

</script>

<a href="javascript:gotoit();">Run the script</a>
<div id="randy" style="display:none;">
The interesting thing about $andy is that he's not nearly as popular with young kids as $handy.<br><br>

What I really find 'interesting' is that this "multiline" thing works as well as $handy's bionic arm. <br><br>
</div>
<div id="steve"></div>

Note the unescaped single and double quotes on the last line. You can do much more complex HTML like a form, but I'm just trying to keep this simple for the example.

In the end, this is more a proof of concept than anything else. You can do this a bunch of ways, but I like the way it iterates through the JavaScript variables in the document and processes the text the same way you'd see that handled in a PHP script. Of course, I haven't added in the handling of array variables, but it could be done.

There's also the reverse option of using regular expressions to search for single words meeting variable naming rules, prefixed with $, then checking to see if there's a variable with that name that holds a string or number value and substituting it. If you had a whole lot of variables in a highly complex script, it might be faster.
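A rough sketch of that reverse approach might look like this. The hereDocReverse name is hypothetical, and the values are passed in as a plain object instead of being read from window, so it can run outside a browser; in a page you could pass window itself.

```javascript
// Hypothetical helper: `vars` is any object holding the values, such as window in a browser.
function hereDocReverse(text, vars) {
  // Match a dollar sign followed by a JavaScript-style identifier.
  return text.replace(/\$([A-Za-z_][A-Za-z0-9_]*)/g, function (match, name) {
    var value = vars[name];
    // Only strings and numbers are substituted; anything else is left as typed.
    if (typeof value === "string" || typeof value === "number") {
      return String(value);
    }
    return match;
  });
}

// Usage with the sample variables from the post, passed in explicitly.
var sample = { andy: "Fred Flintstone", handy: "Steve Austin" };
console.log(hereDocReverse("$andy is not as popular as $handy, says $nobody.", sample));
```

Because it scans the text once instead of looping over every variable, the work depends on how many $names appear in the block, not on how many variables the page has.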

Anyway, I'm rambling. I've tested it on Firefox, Safari, and Opera on my MacBook Pro running Tiger. Try it out on your browser if you like...



Been working on a new project that will go into previews for selected friends and family later this week and go into general release a few days later, once I've had a chance to fix any issues my preview users find.

One part of the project was to add "Tweet This" links to certain items on the site, where clicking on the link would send the user to Twitter and fill in the suggested text of the tweet for them.

It's actually a lot simpler than you think. The format is: http://twitter.com/?status=[URL encoded tweet text].

Now many of you are asking how you "URL encode" the tweet text. Well, if you're using PHP, you use urlencode('text'); where "text" is the text of your tweet. If you want to do it in JavaScript, the built-in encodeURIComponent() function does the job, and the PHP.js library has a JavaScript equivalent for PHP's urlencode.

But here's a little trick I didn't know about until I made this mistake. Make sure your link goes to twitter.com, not www.twitter.com. If it goes to www.twitter.com, the text doesn't get properly decoded. So, trying to tweet Creating a "Tweet This" Link - http://www.brainhandles.com would look like Creating a %22Tweet This%22 Link - http%3A%2F%2Fwww.brainhandles.com, and nobody wants that.
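Putting the pieces together in JavaScript might look something like this. The tweetThisUrl name is made up for the example; encodeURIComponent() is JavaScript's built-in encoder, and it writes spaces as %20 where PHP's urlencode writes them as plus signs.

```javascript
// Hypothetical helper: builds the link in the format described above.
// Note the host is twitter.com, not www.twitter.com.
function tweetThisUrl(text) {
  return "http://twitter.com/?status=" + encodeURIComponent(text);
}

console.log(tweetThisUrl('Creating a "Tweet This" Link - http://www.brainhandles.com'));
```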
