Thursday, April 16, 2009

Bookpool.com is Dead!

I just tried to visit BookPool.com, one of my favorite web sites for reasonably priced technical books, and got this message…

 

image 

Searching Google yielded several web sites indicating that Bookpool has indeed vanished into the night, without a trace.

Thursday, January 15, 2009

Bursting Legal Documents

The mega-firm at which I am employed produces a lot of documents. A lot. Some of the more important documents are financial statements which communicate to the attorneys how much moolah they’re making (or not, as the case may be).

One of our vendors provides several tax documents as one big honkin’ PDF, thousands of pages long, composed of individual partner statements all jammed together, one after the other. Our financial folks pulled their hair out year after year trying to divide these big documents up into individual statements for each attorney. They would literally spread out the whole mess on a huge conference room table and create little piles of paper.

This might explain why most of the people in our Finance dept have offices littered with little stacks of paper. But I digress.

It fell to my lot to create a solution to this little problem. And the solution was found in a little technology called “bursting”, otherwise known as “splitting” a single document into multiple individual documents.

There are several open-source utilities that will take a single large document and split it up into multiple single-page documents, but most didn’t fit the bill at all. We needed a solution that would…

  • Split a document based on text indicators (tags) that would let the splitter know where new documents should begin.
  • Allow the user to specify at least a portion of each split document’s file name. Most bursters will just create a random filename for each split document, without allowing you any control.
  • Allow us to hide the tags by coloring them white, so they don’t appear to anyone reading the reports.

The best (and possibly only) solution I could find that matched these criteria was PDF-Explode. I say “and possibly only” because so far I haven’t found any competing product that remotely meets these criteria.

To burst a document using PDF-Explode, you must have some control over the content of the document you’re going to burst. In our case, we got our vendors to modify the reports they sent us, or we changed any reports we produced internally. The modification is simple enough: to trigger PDF-Explode to create a new document, you simply add a <pdfexplode> tag to the top of the page, thusly…

<pdfexplode>NewFileName</pdfexplode>

Then, you run the PDF-Explode application from the command-line, passing it the name of the “master” PDF file, and a few seconds later you have a horde of new “bursted” files!
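
To make the tag mechanics concrete, here is a minimal Python sketch of the grouping logic only. It is not how PDF-Explode works internally (I don't know that); it assumes some PDF library has already extracted the text of each page into a list of strings, and the function name plan_burst is made up for illustration.

```python
import re

# Matches the marker described above, e.g. <pdfexplode>NewFileName</pdfexplode>
TAG = re.compile(r"<pdfexplode>(.*?)</pdfexplode>")


def plan_burst(page_texts):
    """Group page indexes into output documents.

    A page containing a tag starts a new document named after the tag's
    contents; untagged pages are appended to the document in progress.
    Pages before the first tag are ignored (an assumption of this sketch).
    """
    documents = []  # list of (file_name, [page indexes])
    for index, text in enumerate(page_texts):
        match = TAG.search(text)
        if match:
            documents.append((match.group(1).strip(), [index]))
        elif documents:
            documents[-1][1].append(index)
    return documents


pages = [
    "<pdfexplode>10042</pdfexplode> Partner statement",
    "Partner statement, continued",
    "<pdfexplode>10077</pdfexplode> Partner statement",
]
print(plan_burst(pages))
```

Each entry in the result pairs a file name with the pages that belong in that file, which is all a splitter needs to know before it starts writing output.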


But wait a second… if you can automagically split a ginormous PDF into individual documents, and you can control the filename of each new document, why not go one step further and create an automagical delivery mechanism?

And that is exactly what I did. I built a custom ASP.Net application that would, when visited by an attorney, scan a folder for files that matched their employee number, and display them as links for the attorney to download.
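
The real application was ASP.Net, but the folder scan itself is simple enough to sketch in a few lines of Python. The naming convention here (employee number, then an underscore, then the rest of the file name) is an assumption for illustration, as is the function name statements_for.

```python
from pathlib import Path


def statements_for(employee_number, folder):
    """Return the names of burst PDFs belonging to one employee, sorted.

    Assumes files are named "<employee number>_<anything>.pdf". The trailing
    underscore in the prefix keeps employee 1004 from matching 10042's files.
    """
    prefix = f"{employee_number}_"
    return sorted(
        path.name
        for path in Path(folder).glob("*.pdf")
        if path.name.startswith(prefix)
    )
```

The web page then only has to render each returned name as a download link, after checking that the visitor really is that employee.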

Quick tangent: I must note here that PDF-Explode can automatically email bursted documents to an email address you specify in each tag, if you so desire, but that was a little scary for us. Our users wanted time to review and reflect before publishing anything.

So now it takes just a couple of hours (and many times much less) to burst a document and drop the files into a folder, vs. about a week doing things the old way. And attorneys love being able to get their financial documents whenever they like from a web page (I did, in fact, implement several security measures, which I may go into in a future post).

Does this make sense? Let me know in the comments if you have any questions about bursting, or want to know more about how this was implemented.

Wednesday, December 24, 2008

ADO.Net Entity Framework – Assembly Not Found Exception

Allow me to share with you the resolution for one of those simple things that should take a few minutes to resolve but ends up taking 3 days.

I’m using the ADO.Net Entity Framework within Code Toaster, my template-based code generation tool, to retrieve and store statement completion metadata to a SQLite database. SQLite newly supports the Entity Framework Designer from within Visual Studio.Net 2005 and 2008, and all that works wondrously.

But when I ran my code in the debugger my entity context object’s constructor threw an “Unable to load System.Drawing, Version=1.0.3300.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a” exception. Oddly enough, the version number was for the .Net 1.0 version of System.Drawing. Code Toaster is 100% .Net 3.5 code (or so I thought), which means the version number should have read 2.0.0.0 (.Net 3.5 is really .Net 2.0 plus some extras, like WCF and WPF).

I enabled .Net Framework source stepping (an awesome new feature in VS.Net 2008 SP1), and stepped through the actual .Net Framework source code to try and figure out why .Net felt it needed to load the 1.0 version of System.Drawing when establishing a connection to SQLite.

image

Enabling .Net Framework source stepping

As it turned out, .Net was trying to load lots of assemblies! It appeared as though it was traversing through every assembly I was referencing, as well as every assembly those assemblies referenced, recursively! Dude.

I cranked open Reflector and examined every assembly I had explicitly referenced. Sure enough, an assembly from Actipro Software was referencing System.Drawing, Version=1.0.3300.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a. The next question was – why was the .Net Entity Framework doing this?

After some more Googling and soul searching, I discovered precisely why. When you add an .edmx model to your project, an Entity Framework connection string is automagically added to your app.config (or web.config, as the case may be). Upon closer inspection, one realizes that this connection string is not like others you may have seen before.

For one, it includes a metadata section, which looks something like this:

metadata=res://*/IntellicacheModel.csdl|res://*/IntellicacheModel.ssdl|res://*/IntellicacheModel.msl

The metadata section is a map that tells .Net where to find these three resources. res://*/ means “search all the assemblies you can find, and don’t stop until you’ve either thrown an exception or found what you’re looking for.”

I replaced the asterisk with the name of the correct assembly (the one whose project contained the .edmx model; you can also verify by inspecting your assembly in Reflector for these three resources), and it worked! Eureka!
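
For the record, the edit is a plain string substitution on the metadata value. The Python sketch below just shows the before and after; the assembly name CodeToaster.Data is a made-up placeholder, so substitute the name of whichever assembly embeds your model resources.

```python
def pin_metadata_assembly(metadata, assembly_name):
    """Replace the res://*/ wildcard in each metadata path with one assembly."""
    return "|".join(
        part.replace("res://*/", f"res://{assembly_name}/", 1)
        for part in metadata.split("|")
    )


original = (
    "res://*/IntellicacheModel.csdl|"
    "res://*/IntellicacheModel.ssdl|"
    "res://*/IntellicacheModel.msl"
)
# "CodeToaster.Data" is a hypothetical assembly name used for illustration.
print(pin_metadata_assembly(original, "CodeToaster.Data"))
```

With the assembly named explicitly, there is no reason for the resource lookup to go wandering through every other referenced assembly.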

I suppose if I had simply read the documentation first I could have saved myself some trouble.

Monday, December 8, 2008

Microsoft Installation Troubles

I cannot install Microsoft Expression Blend 2. If you look here, you'll see that lots of other people can't either. The problem seems to occur if you had installed and then uninstalled a pre-release version of Blend - it hoses up future attempts to install later versions. I even read about one dev who just wanted to reinstall the product to try and fix an Intellisense issue, but the installer wouldn't even let him do that.

I did, however, get it to install in a pristine virtual machine. But lo, I needed to upgrade to Blend 2 SP1 to get the latest Silverlight 2 support. And to do that, I had to install the .Net Framework 3.5 SP1 update.

The screenshot below shows what happened next. This is the most bizarre, frustrating thing. Can you tell what's messed up about this picture?

image

That's right, the installer is telling me that it can't continue until I close... the installer!

Wednesday, November 12, 2008

Online Charting and Graphing

If you have a need for powerful web-based charting and graphing with a low learning curve, I highly recommend either of these two solutions.

  • Flot - Flot is a 100% JavaScript library built on top of jQuery that generates very nice-looking charts and graphs (with tooltips and such-like). Plays nice with Internet Explorer 6, Google Chrome, Firefox 2.x+, Safari 3.0+, Opera 9.5+ and Konqueror 4.x+, but in my testing does NOT work with IE 7 OR 8.
  • Visifire - Visifire is a great open-source charting platform using Silverlight. Silverlight must be installed on client workstations, but I've had remarkable success using Visifire. The web-based chart designer is a great starting point for using this incredibly powerful tool.

Now go wow the boss!

Wednesday, October 1, 2008

Code Toaster AppDomain Magic

My personal project for almost 4 years now, Code Toaster, is almost ready to show to the world (again, if you've been following this blog for a while). Code Toaster is an IDE, with supporting libraries, for rapidly developing code generation templates. You can find a lot more information over at http://www.codetoaster.com/, including documentation, code samples, and video tutorials.

But in this post, I'd like to take a look at a few of the more difficult architectural challenges I faced when building Code Toaster, all of which were solved by taking advantage of (or working around issues with) the .Net Application Domain (aka AppDomain). But first, some background information.

The AppDomain

A .Net process consists of one or more Application Domains, which are used to provide code isolation, security, and so forth. Code running in one Application Domain cannot directly invoke code running in another Application Domain; however, methods on the AppDomain class make it possible to load a .Net assembly into a new AppDomain, create an instance of a class in that assembly, and invoke its properties and methods.

AppDomains are important to Code Toaster for several reasons. The foremost is that, once an assembly is loaded into an AppDomain using Assembly.Load (or one of its associated methods), the assembly cannot be unloaded on its own; the only way to release it is to unload the entire AppDomain. In several cases it's necessary for Code Toaster to temporarily load an assembly to retrieve type (or other) information contained in it, and then unload the assembly to reclaim the memory used.

Scenarios that benefited from using AppDomains

  1. In one scenario, Code Toaster loads assemblies in the GAC to determine each assembly's target runtime. Loading every assembly in the GAC into memory could consume roughly 25-50 megabytes of RAM; it's nice to be able to reclaim that memory afterwards.
  2. In another scenario, when Code Toaster compiles a template project into a new assembly, that assembly must be loaded (into some AppDomain) so that it can be executed. As the developer iteratively modifies and recompiles, each compile loads a new assembly, and memory usage slowly adds up. It's nice to be able to reclaim that memory after each compilation.
  3. Another very important scenario occurs when debugging templates. Attaching the Visual Studio debugger to a template executing in Code Toaster's AppDomain causes the Code Toaster process (codetoaster.exe) to end when the debugger detaches. This is not unique to Code Toaster: the popular CodeSmith code generation tool suffers from the same problem. As I was later to discover, executing templates in their own AppDomain causes debugging to operate as desired (detaching the debugger does not kill the process).

The figure below shows how Visual Studio attaches to a separate AppDomain within the CodeToaster.exe process, whose purpose is solely to contain and execute templates in isolation. This provides several benefits:

  1. Each time a template is recompiled, the "template execution" AppDomain is unloaded and recreated, reclaiming any memory space used by the previously loaded template assembly.
  2. Debugging works as expected (detaching the Visual Studio debugger from a running template will not crash CodeToaster.exe).


This was easier said than done. It's one thing to run some code in another AppDomain. It's quite another to load a Type in AppDomain B into a Property Explorer in AppDomain A, and have it appear and work correctly, with the appropriate adornments for UITypeEditors and so forth (see the figure to the left: the template BDC.Sample is loaded in another AppDomain, but appears correctly in the Template Property Explorer).

In future posts I intend to take a closer look at the code that made all of this come together. For now, I invite you to head on over to http://www.codetoaster.com/ and check out the finest code generation tool ever made!

Wednesday, September 3, 2008

Internet Security

This video from F-Secure Corporation provides a great outline of how viruses and other malware have evolved over the last decade or so, and why security (in particular browser security) is of critical importance.

And along those lines, Google Chrome has already been found vulnerable to a big nasty security glitch.

Tuesday, September 2, 2008

Google Chromium (Chrome)

I'm sure by now you've heard of, and perhaps played with, Google's new free web browser, Google Chrome, which is built on the open-source Chromium project.

If not, well... I've spent the last 30 minutes messin' around. And as far as the overall browser experience goes, I must confess to not being terribly impressed. The "new tab" feature that keeps track of your favorite sites is pretty cool. And it is lightning fast, except where the "OmniBox" seems to update a little jerkily.

But the developer tools are a different story. It looks like someone at Google decided to spend a good ol' chunka time making their browser very developer friendly.

To launch the Chrome "Inspector", right-click anywhere on a page and click the Inspect Element menu item. From there, you can browse the DOM, inspect javascript errors, investigate resource problems (broken links, large files), and so much more.

Google Chrome Inspector

There's also a nifty-orama javascript debugger. See the screenshot below for an example of Chrome Inspector catching one of my javascript bugs.

Google Chrome Debugger

And then there's the javascript console window, which seems to actually work(!), and comes with a slimmed down and barely there version of Intellisense (but Intellisense no less). I'm very excited about this because it's hard to find a readily available javascript command window that just works. And I've been having issues with FireBug of late.

Oh yeah, I'm going to have loads of fun with this. If you're a web dev, I encourage you to give it a whirl.


Thursday, August 21, 2008

Red Gate purchases Reflector

Red Gate Software has purchased, and will take over future development of, Lutz Roeder's uber-popular tool, .NET Reflector.

Under an agreement announced on Wednesday 20th August, Red Gate will be responsible for the future development of .NET Reflector, the popular tool authored by Lutz Roeder. Red Gate will continue to offer the tool for free to the community.

Is this good news? I'm not sure. Red Gate says they will continue to offer Reflector as a free community-based tool, so it remains to be seen what this really means for .Net coders everywhere.

Click the link for full article.


Friday, August 15, 2008

Keep It Simple...

Programming is like sex. One mistake and you have to support it for the rest of your life.
~Michael Sinz

I was in the process of reviewing some documentation for a bit of popular legal software my employer purchased a while back, when I stumbled upon an interesting tidbit. I recognized it immediately for what it was, but I'll get into that a bit later. 

This particular vendor provides a RESTful API into their offerings. When invoking several of said API's methods, you can specify, as a parameter, a contactId in either one of two formats, termed Windows and Web.

The Windows format looks like this: 2/2195. Essentially two numbers delimited by a slash. Easy enough!

The Web format, on the other hand, looks like this: 8589936787.

So, to call their RESTful findContacts method, I would pass one of these two formats, thusly:

http://…/findContacts?contactId=[insert contact id here]

Reading further, I discovered that the Web format is actually a fancy calculation computed against the parts of the Windows format, thusly:

  • Source ID = 2
  • ID = 2195
  • Web ID = (2 * 2^32) + 2195

The result of that calculation would give us the longer Web format: 8589936787.
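
Here is that arithmetic as a short Python sketch. The forward conversion follows the formula above; the reverse direction is my own assumption (it only works if the ID part is always smaller than 2^32), and the function names are made up rather than taken from the vendor's API.

```python
SOURCE_MULTIPLIER = 2**32  # the source ID is multiplied by 2^32


def windows_to_web(windows_id):
    """Convert a Windows-format ID such as "2/2195" to the Web format."""
    source_id, item_id = (int(part) for part in windows_id.split("/"))
    return source_id * SOURCE_MULTIPLIER + item_id


def web_to_windows(web_id):
    """Recover the Windows format, assuming the ID part is below 2^32."""
    source_id, item_id = divmod(web_id, SOURCE_MULTIPLIER)
    return f"{source_id}/{item_id}"


print(windows_to_web("2/2195"))    # 8589936787
print(web_to_windows(8589936787))  # 2/2195
```

Note that both functions need the same two numbers either way; the Web format is just the pair packed into one integer.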

According to the documentation, this was recently added as a "feature", which I suppose could be marketed as "slashless Web Ids". But, as far as mine eyes can see, it really adds no value to the application.

What it does offer is oodles of cyclomatic complexity, the extent of which may not be apparent at first glance. Let's take a closer look:
  1. Following this change, all developers on the project need to be educated that IDs can be input in two different ways. Why do they need to be aware? So that they don't assume that all IDs going forward will contain a slash, and can revise the way they handle ID input. Maybe they need to call a different function to parse IDs. Maybe they don't need to do anything at all. But I'm betting that somebody, somewhere, will either need to send an email about this change, spend an hour discussing it in a meeting, or add it to the project's internal documentation.
  2. The test cases developed before this change are no longer valid for testing ID input. Now you have to test the case where ID input doesn't contain a slash (the Web format), as well as the detection of the ID input format (i.e. running different code based on whether the ID contains a slash or not).
  3. In this case, a new web method was developed, convertDualId, which does the math for any math-challenged developers out there. Something else that needs to be tested.
  4. The documentation I'm reading contains three (3) HTML (.chm file) pages of documentation on this topic, explaining the difference between the two formats, and how to do the math to calculate the Web Id if you don't want to call the new convertDualId method. All of which needs to be validated, proofread, and so on.

And all that for... what? Seriously. To calculate the Web Id, I still need to have the two original slash-delimited values, so what am I gaining? I'll tell you what I'm gaining: an id that doesn't contain a slash.

Woop-te-doo.

If a slash is bothering you that badly, I can think of several other much simpler alternatives. Like, say, just URL-encoding the slash as %2F. But of course that wouldn't properly showcase the developer's creativity, virility, and prowess.
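
To show how little work the %2F alternative is, here is a Python sketch using the standard library. Whether the vendor's API would actually accept an escaped slash in contactId is an assumption I haven't tested, and encode_contact_id is a made-up helper name.

```python
from urllib.parse import quote, unquote


def encode_contact_id(windows_id):
    """Percent-encode a Windows-format ID for use in a query string."""
    # safe="" forces "/" to be escaped; by default quote() leaves it alone.
    return quote(windows_id, safe="")


encoded = encode_contact_id("2/2195")
print(encoded)           # 2%2F2195
print(unquote(encoded))  # 2/2195
```

One library call on each side of the wire, no new ID format, and nothing extra to document.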

And therefore I must conclude that what this is, in fact, is gold plating at its finest. An anonymous developer's proud signature, showcasing his unique ability to do maths.

Hardly aware is he that writing code like this is akin to locking a ball and chain around his ankle, forcing him into supporting this unnecessary code until the end of time. Or at least until he realizes what he's done and leaves for unspoiled pastures.

Keep It Simple, Stupid.