Added JavaScript syntax checking via Esprima and a Git pre-commit hook

I came across a brilliant project the other day – Esprima from Ariya Hidayat, the author of PhantomJS.

What is Esprima? Esprima is a JavaScript Parser written in JavaScript Syntax Validator. It forms the basis of several different tools – a minifier, a code coverage tool, a syntax validator – just to name a few.

I was immediately interested in the syntax validation tool. It’s not a linter – it just checks that the JavaScript written is syntactically correct. Why would you want this if you already have JsHint and JsLint?

  1. It is extremely fast. Validating three.js (800 KB source) takes less than a second on a modern machine.
  2. It looks only for syntax errors, it does not care about coding style at all.
  3. It handles generated files as the result of minification or compilation (CoffeeScript, Dart, TypeScript, etc).
  4. It tries to be tolerant and not give up immediately on the first error, especially for strict mode violations.

Esprima is available as an npm package, so installing it only takes a second:

sudo npm install -g esprima

Using Esprima from the command line is simple:

esvalidate file.js

The only thing to note about running from the command line: if the validation succeeds, you won’t get much in the way of confirmation. Which can be painful if you are processing a whole directory. You only get useful feedback in the default mode on error.

However, if you don’t mind reading a little XML:

esvalidate lib/*.js --format=junit

Prints junit XML which at least you can visually parse to see which files were validated.

Where would I use this where I might not use JsHint? As a pre-commit hook to screen my checkins. Instead of going through a check-in, building everything, then running JSHint just to hear that something is not up to spec, I can add a little script that will do a quick sanity check of my JS before I go to commit anything to git.

If you’ve never created a pre-commit hook before, it’s pretty easy. Two lines in bash will give you a pre-commit file:

touch .git/hooks/pre-commit
chmod +x .git/hooks/pre-commit

This is windows version of the code for the pre-commit hook:

#!/bin/sh
files=$(git diff-index --name-only HEAD | grep -l '\.js$')
for file in $files; do
esvalidate $file
if [ $? -eq 1 ]; then
echo "Syntax error: $file"
exit 1
fi
done

To make this work on Linux, just remove the remove the #!/bin/sh line.

For more information about Esprima, check out this article by the author.

Advertisements

.Net SQL Parsing – Using the TSqlParser library

As a bit of a preface to this post: it is hard to find a free SQL Parser for .NET.

There is a company that has a terrible library that they charge $150 bucks for. There are a couple of incomplete implementations done for school projects or for narrowly focused tasks.

So if you want a no-strings attached free parser for SQL, you’re out of luck.

However, since most people who want a .NET parser are writing code on a Windows machine, and use Visual Studio, there is (lightly documented) hope: the TSqlParser library that ships with Visual Studio.

This is a fully featured parsing library for SQL Server SQL syntax. I’m not sure about the support of other DB’s SQL syntax, but I would imagine it’s poor.

On an x64 Windows machine, using Visual Studio 2010, the dll’s which contain the TSqlParser library are located at:

C:\Program Files (x86)\Microsoft Visual Studio 10.0\VSTSDB

The class TSql100Parser in Microsoft.Data.Schema.ScriptDom.Sql gets you the parser for Sql Server 2008.

To instantiate an instance of the TSql100Parser class, you have to supply the constructor with one parameter:

public TSql100Parser(bool initialQuotedIdentifiers )

The docs for this are better than trying to figure out what initialQuotedIdentifiers means:

Specifies whether quoted identifier handling is on.

I’m guessing this has to do with declaring aliases for columns like this:

select 
   bar as 'This is the alias for foo.bar'
from 
   tblFoo
--instead of like this:
select
    bar as [This is the alias for foo.bar]
from
    tblFoo

Using the parser is relatively simple. Once you reference the correct dll’s in your project:

var parser = new TSql100Parser(true); 
var script = parser.Parse(reader, out errors) as TSqlScript;

foreach (TSqlBatch batch in script.Batches)
{
    foreach (TSqlStatement statement in batch.Statements)
    {
        //At this point, you have a collection of SQL Statements... 
        //that can contain collections of SQL Statements...  
    }
}

My comment in the code above is to help you understand something about parsing SQL – almost every relationship is expressed as a tree, where something contains more of the same thing, and that thing may contain more of the same thing, or maybe not.

Which means the easiest way to navigate the data is recursively. In other words, the Rules for using TSqlParser:

  1. LEARN TO LOVE THE RECURSION.
  2. Refer to the Rules for using TSqlParser

I’ll give you an example scenario to show you what you’re up against.

One common scenario is searching your code for SELECT statements. Select statements can be contained in:

  • Stored Procedures
  • If Statements
  • While Statements
  • BEGIN statements
  • Try/Catch Blocks

I’m sure I’m missing some cases.

So it’s not as simple as saying “give me all the statements that are select statements”. Instead, you have to write something like:

function ProcessStatements(statements)

    foreach(statement in statements) 
        if statement is a Stored Procedure
           ProcessStatements(statement.MyStatements)
        if statement is an If Statement
           ProcessStatements(statement.MyStatements)
        if statement is a While Statement
           ProcessStatements(statement.MyStatements)
        if statement is a Select Statement
           ProcessSelect( statement)

So once you get your select statement (or your collection of select statements), how do you process them?

Well unfortunately it’s not straightforward. The SelectStatement class contains a field called QueryExpression – this field contains what kind of Select we’re dealing with.
As far as I can determine, there are three types of QueryExpressions:

  • QuerySpecification
  • This is an actual SELECT statement

  • BinaryQueryExpression
  • This is a UNION or similar expression between two SELECT statements

  • QueryParenthesis
  • This is a SELECT surrounded by parenthesis. In other words, a sub-select

So again, if you only want SELECT statements, you have to weed through the three types of QueryExpressions until you get to the underlying SELECT statements.

So eventually you’ll get to a list of QuerySpecifications (which represent the SELECT statements from your original query).

Now here comes the good stuff: you can now weed through the SELECT fields programmatically and get out whatever information you want. Here are some of the fields on QuerySpecification:

FromClauses     Gets a list of FROM clauses.
GroupByClause   Gets or sets a GROUP BY clause.
HavingClause    Gets or sets a HAVING clause.
Into            Gets or sets the into table name.
SelectElements  Gets a list of the selected columns or set variables.
TopRowFilter    Gets or sets the usage of the top row filter.
UniqueRowFilter Gets or sets the unique row filter value.
WhereClause     Gets or sets a WHERE clause.

Just tons of SELECT goodness. However, be warned: each of these fields contains lists with multiple subclasses. So more recursive diving if you want to get something very specific out of this select data.

To get farther you might have to dive into the docs.
Link to MSDN Namespace Docs

Free Collection of Microsoft E-Books

If you’re a Microsoft Dev, want to learn a bit more about the following products:

  • SharePoint 2010
  • Sql Server 2012
  • Visual Studio 2010
  • Windows 8
  • Windows Phone 7
  • Office 365
  • Office 2010
  • ASP.NET 4.5 Web Forms
  • ASP.NET MVC 4
  • Microsoft Dynamics CRM 2011 (God Rest Your Soul)

Microsoft has released a bunch of free e-books about these technologies (and a few others).

The links to the e-books:

Microsoft Free E-Books – Page 1
Microsoft Free E-Books – Page 2

Derby.js – The Ready() Function, and Adding Client-Side Scripts to your App

I’ve found a neat feature of derby dealing with the ready() function.

I’ve been creating a derby app, and in my application I need to load up a client-side calendar. With a standard HTML web page this is straightforward thing to do. On the page you wanted the calendar, you would include the client js for the calendar, some code to load it, and that would be that.

Derby introduced some complexity to this relatively simple task.

On my first attempt, I put my scripts in the section of the page that I needed the calendar on. I added a script to load the calendar as well. When I went to the url of the page, it loaded immediately. Success! (I thought).

Then I clicked a link away from my calendar, and then clicked back. No calendar.

What happened? When I loaded the link originally, the page was rendered from the server. The second time, the page was rendered client side. Something wasn’t working with loading the calendar on the client-side render.

You need a place to put your code that guarantees that it will load on the client-side. The app.ready() function is designed to handle this scenario.

What is the purpose of the app.ready() function? From the derby documentation:

Code that responds to user events and model events should be placed within the app.ready() callback. This provides the model object for the client and makes sure that the code is only executed on the client.

This function is called as soon as the Derby app is loaded on the client. Note that any code within this callback is only executed on the client and not on the server.

I’ve bolded the part I think is particularly important. This code is run as soon as the client is loaded. Which means if you have more than a single-page app, using app.ready() to load a feature might not work out.

So what to do? An undocumented feature of Derby – app.on().

What does it do? It allows you to load code AFTER a specific page has loaded. Which is exactly what I’m looking for.

So here’s what I ended up writing in my app.ready():

  app.on('render:calendar', function(ctx) {
    logger.log("rendering calendar client-side...");
    new Timeline("timeline", new Date());
  });

In this example this code will run after rendering the calendar view. On the Calendar page, there is a script block which loads the Timeline function.

After writing this code, I tried my page again. Still no luck. I wasn’t getting the calendar to load on both the client and server load.

On a lark, I moved the block containing my Timeline javascript to my index view, and tried again. Success.

Which makes sense. If you are rendering pages via the client side load, additional resources aren’t going to be loaded. So by putting my script into the index view, it’s unfortunately loaded for all my pages, but at least it’s available for both the client and server side rendering of my cool control.

Cracking the Coding Interview – The Tower of Hanoi and Poor Editing

I just finished the Stack section of Cracking the Coding Interview and came across an old puzzle – The Tower of Hanoi.

I struggled with solving this problem. I wrote this elaborate, strange algorithm to try to solve it (which should have been a dead give-away that I had it wrong). Ironically enough, hidden in the 20-30 lines of code I wrote were the three lines of code I needed to solve the problem.

Anyways, after beating my head in trying to solve this, I ended up going to the back of the book and looking up the solution.

And found this pile of shit psuedocode. I’ve shortened the comments, but the content is the same.

moveDisks(int n, Tower origin, Tower destination, Tower buffer) {

   if (n <= 0) return;

   //Move top n - 1 disks from 1 to 2
   moveDisks(n-1, tower 1, tower 2, tower 3);

   //Move top from 1 to 3
   moveTop(tower 1, tower 3);

   //Move top n - 1 disks from 2 to 3
   moveDisks(n-1, tower 2, tower 3, tower 1);
}

In this tiny amount of code, in over five revisions to her book, the author has managed to miss two errors.

  1. In the first line, Variables origin, destination, and buffer are declared, but then tower 1, tower 2, and tower 3 are used. Which is which?
  2. In line 5, the comment says “Move top n-1 disks”. Then, in lines 8-9, the author indicates that you move the top. Since you’ve already moved all the items in the stack but one, the bottom element, that line should read “move bottom”.

Maybe these aren’t huge issues, but since I was pretty frustrated with this problem, having to correct the solution didn’t help my frustration.

The correct code:

moveDisks(int n, Tower origin, Tower destination, Tower buffer) {

   if (n <= 0) return;

   //Move top n - 1 disks from origin to buffer
   moveDisks(n-1, origin, buffer, destination);

   //Move nth disk (the bottom disk) from origin to destination
   moveBottom(origin , destination);

   //Move top n - 1 disks from buffer to destination
   moveDisks(n-1, buffer, destination, origin);
}

I hope this helps somebody else who’s working on this problem.

Cracking the Coding Interview – Linked Lists – The Runner Technique

I’ve been going over the Linked List section of Cracking the Coding Interview and I’m realizing that almost every time I get stumped with a problem the solution is the Runner Technique (or slow/fast pointers).

The idea behind the runner technique is simple; use two pointers that either move at different speeds or are a set distance apart and iterate through a list.

Why is this so useful? In many linked list problems you need to know the position of a certain element or the overall length of the list. Given that you don’t always have the length of the list you are working on, the runner technique is an elegant way to solve these type of problems (and in some cases it’s the only solution). Here are some examples of linked list problems where the runner technique provides an optimal solution:

Where do you use it? If you get handed a linked list question, and you find yourself asking these questions:

  • How do I figure out where these two things meet?
  • How do I figure out the midpoint?
  • How do I figure out the length?

You’re most likely dealing with a problem where you need to use the runner technique.

How does it work? I’ll illustrate one of the examples above.

Given two lists of different lengths that merge at a point, determine the merge pointTerrible Branched Linked List Picture

  1. Start a node at the beginning of each list on the left.
  2. Count how many nodes each pointer encounters. (The top list should be 5, the bottom list 4.)
  3. The difference between the two numbers is the difference in length of the two lists before the merge point (the difference is 1)
  4. Move your nodes to the beginning of each list, but the longer list should get a headstart equal to the amount of difference. (so the top list would start on its 2nd element, while the bottom list would start on its 1st).
  5. Move each node forward at the same speed until they meet. Where they meet is the collision point.

Each of the links above contains code and solutions to each of the problems. I hope this example and the ones above show the usefulness of this approach.

There are also some examples located here of various linked list algorithms problems and solutions written in JavaScript.

Link

Sublime Text has rapidly become my favorite text editor. Cross platform, easy to use, great feature set. The Command Palette feature, where you can search for a feature without having to know where it is in the application, is an piece of usability brilliance.

Somebody cobbled together a great step-by-step set of directions on how to install sublime on ubuntu. I wanted to give a shout-out to them and their work.

How to install Sublime Text 2 on Ubuntu 12.04 (Unity) | Technoreply.