
Entries in github (2)

Saturday, Oct 15, 2011

Using Regular Expressions Part 2 - The Cocoa Connection

Last time, in Part 1 of this series, I wrote about the basics of regular expressions, and the phrases I tend to use. Today, I'm going to talk about the mechanics of how I use Regular Expressions in Cocoa.

But first, an historical diversion

In my opinion, there are two different ways that programming languages implement Regular Expressions: the Perl/Ruby way, and the Java/C#/Python/Cocoa way.

In Ruby and Perl, regexes are implemented directly on the String type, whereas in the other languages, there is a separate object that contains the functionality. Here's what you need to know to do a regex substitution on a string in Ruby:

myString.sub(/pattern/, 'replacement')

Clean, easy, and immediately usable if you know what pattern you want to use.

Here's what you need to know to do the same thing in Cocoa:

+[NSRegularExpression regularExpressionWithPattern:(NSString *) pattern 
    options:(NSRegularExpressionOptions)options error:(NSError **) error]

-[NSRegularExpression replaceMatchesInString:(NSMutableString *) string 
    options:(NSMatchingOptions)options range:(NSRange)range 
    withTemplate:(NSString *)template]

which is not clean, not easy and contains a bunch of stuff you have to go look up to be able to get started. What are NSRegularExpressionOptions and NSMatchingOptions? What's a template? Do I really have to create an NSRange for this? And that leads to the obvious question: Is all this effort really worth it?
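For reference, here is a minimal sketch of how those two calls fit together for a plain find-and-replace. The helper name SketchReplace is made up for illustration and is not part of Cocoa or of any library mentioned in this post; only the NSRegularExpression calls are real API.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not from the post): replaces every match of `pattern`
// in `input` with `replacement`. Returns nil if the pattern does not compile.
static NSString *SketchReplace(NSString *input, NSString *pattern, NSString *replacement) {
    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:pattern
                                                  options:0      // no NSRegularExpressionOptions
                                                    error:&error];
    if (regex == nil) {
        NSLog(@"Bad pattern: %@", error);
        return nil;
    }
    // The replace method works in place, so it needs a mutable copy of the input.
    NSMutableString *result = [NSMutableString stringWithString:input];
    [regex replaceMatchesInString:result
                          options:0                              // no NSMatchingOptions
                            range:NSMakeRange(0, [result length]) // search the whole string
                     withTemplate:replacement];
    return result;
}
```

With this in place, SketchReplace(@"build 42", @"[0-9]+", @"7") should give back @"build 7". Three pieces of ceremony (the error pointer, the mutable copy, the range) for what Ruby does in one call.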

Now I don't know about you, but I don't want to spend any effort remembering any of those option parameters, and I don't want to take the time to look them up every time I want to use a regular expression. To me, the beauty of Objective-C is that it lets you build most of what you need to know directly into the method signatures.

Let's simplify things a little

So that's what I did. For the rest of this post, I'll be using the categories on NSString found in a repo I wrote on GitHub called RegexOnNSString.

There are three basic methods I wrote:

-(NSString *) [NSString stringByReplacingRegexPattern:(NSString *)regex 
    withString:(NSString *) replacement]

which takes a string, finds the occurrences of the regex pattern and replaces them with the string replacement.

-(NSArray *) [NSString stringsByExtractingGroupsUsingRegexPattern:(NSString *)regex]

which gives you an array of all the pattern groups (things in parentheses) it found in your string, and

-(BOOL) [NSString matchesPatternRegexPattern:(NSString *)regex]

which just tells you whether a pattern is present in your string or not.

There are two additional, optional parameters that you can add, caseInsensitive:(BOOL) ignoreCase and treatAsOneLine:(BOOL) assumeMultiLine.

caseInsensitive is hopefully self-explanatory, and treatAsOneLine just means that you expect that your string has (or might have) newline (\n) characters in it, and you want them to be treated like any other character.
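As a rough sketch of what flags like these could translate to underneath, here is one plausible mapping onto NSRegularExpressionOptions. This is an assumption for illustration (the helper SketchOptions is hypothetical, and the category's actual implementation isn't shown in this post); the two option constants themselves are real Foundation API.

```objc
#import <Foundation/Foundation.h>

// Hypothetical mapping from the two BOOL parameters to Foundation's options.
static NSRegularExpressionOptions SketchOptions(BOOL ignoreCase, BOOL treatAsOneLine) {
    NSRegularExpressionOptions options = 0;
    if (ignoreCase) {
        // Letters in the pattern match both upper and lower case.
        options |= NSRegularExpressionCaseInsensitive;
    }
    if (treatAsOneLine) {
        // Lets "." match newline characters too, so they behave like any other character.
        options |= NSRegularExpressionDotMatchesLineSeparators;
    }
    return options;
}
```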

To get them, you just need to grab the MIT-licensed code from github, include NSString+PDRegex.h and NSString+PDRegex.m in your project, and put

#import "NSString+PDRegex.h"

at the top of your source file.

How about some examples?

The simplest of these is the one that returns the boolean, like so:

if(![emailAddress matchesPatternRegexPattern:@"@.*\\..*"]) {
    NSLog(@"If the user is going to give us a fake email address" \
          @" they could at least try and make it look like one" \
          @" by making sure it has an at-sign and a dot in it.");
}

I use this a lot for string validation. No sense in trying to send an email if there isn't an at-sign in it, and no use trying to convert an NSString to an NSURL if the string doesn't at least contain '://'. (Note that I have to use two backslashes there because @"@.*\." will cause Xcode to generate a "Lexical or Preprocessor Issue: Unknown escape sequence '\.'" warning.)
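To make the escaping concrete: the Objective-C literal @"\\." compiles down to the two characters \. and the regex engine then reads those as "a literal dot". Here is a minimal sketch of a boolean check built directly on NSRegularExpression. SketchMatches is a hypothetical name, and this is not necessarily how the category does it.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: YES if `pattern` matches anywhere inside `input`.
static BOOL SketchMatches(NSString *input, NSString *pattern) {
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:pattern options:0 error:NULL];
    if (regex == nil) {
        return NO;  // an invalid pattern is treated as "no match" in this sketch
    }
    NSRange whole = NSMakeRange(0, [input length]);
    return [regex firstMatchInString:input options:0 range:whole] != nil;
}
```

SketchMatches(@"user@example.com", @"@.*\\..*") comes back YES, while @"user@localhost" fails the same check because nothing after the at-sign is a literal dot.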

The one I use next most often is the one that returns an NSString. I use this one for extracting substrings. For example, in:

<Warning: Shameless Plug> a Mac application I recently released as an Open Beta that helps iOS developers deploy Apps to their test devices without having to use a USB cable <End Shameless Plug>

I'm getting a string that is the path of the .app file that the user dragged-and-dropped onto my App (and it's either a path if they dropped it onto the App Icon in the Dock, or a URL if they dropped it onto the Window). From that string, I need to figure out what their iOS App is named (so I can use that name in the notification). I use the stringByReplacingRegexPattern method for that. I could use [[NSString lastPathComponent] stringByDeletingPathExtension] for that, but by using regexes, I don't have to go look up the path component methods, like I just did to put them in this post. But an even better example from that app is:

NSString *dSYMPath = [droppedPath stringByReplacingRegexPattern:@"\\.app$" withString:@".dSYM"];

That lets me save off the dSYMs so the user can get to the symbol data for that build if they need it later.

I also use it to get the user's Dropbox public URL: I let the user drop any Public URL that Dropbox gives them into the preferences panel, and then I use:

 NSString *dbPublicRoot=[pastedLink stringByReplacingRegexPattern:
    @"^(http://dl.dropbox.com/u/[0-9][0-9]*)[^0-9]*.*$" withString:@"$1" caseInsensitive:NO];

so that I don't have to rely on the user to correctly truncate the URL at the right place, and the user doesn't have to think about it.
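The @"$1" there is a template: it stands for whatever the first parenthesized group matched, which is also the answer to the "What's a template?" question from earlier. A minimal sketch of the same idea written directly against NSRegularExpression follows. SketchDropboxRoot is a hypothetical name, and the pattern is a slightly tightened variant of the one above (dots escaped), not the exact one from my App.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: trims a Dropbox public link down to its numeric-user root.
// A string that does not match the pattern is returned unchanged.
static NSString *SketchDropboxRoot(NSString *link) {
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"^(http://dl\\.dropbox\\.com/u/[0-9]+).*$"
                                                  options:0
                                                    error:NULL];
    // "$1" in the template is replaced by the text captured by group 1.
    return [regex stringByReplacingMatchesInString:link
                                           options:0
                                             range:NSMakeRange(0, [link length])
                                      withTemplate:@"$1"];
}
```

Given @"http://dl.dropbox.com/u/12345/builds/MyApp.ipa", this returns @"http://dl.dropbox.com/u/12345".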

The last method, the one that returns the NSArray, I don't use as often, but when I do, it can save me a lot of effort. For example, recently I was implementing a Tic-Tac-Toe game as a programming exercise as part of an interview process. So when I was shipping turns between the two players, I was actually sending one string:

NSString *stringForThisMove = [NSString stringWithFormat:
      @"Move %@=Player %@ to Square %@\n",
      [move TurnNumber],
      playerThatMoved,
      [move SquarePlayed]];

and then on the receiving end, I used:

NSArray *extractedStrings=[moveString
       stringsByExtractingGroupsUsingRegexPattern:
       @"^ *Move  *([0-9]) *= *Player  *([XO])  *to  *Square  *([0-9]) *$"
       caseInsensitive:YES treatAsOneLine:YES];

from which [extractedStrings objectAtIndex:0] was the move number, [extractedStrings objectAtIndex:1] was the player (@"X" or @"O") and [extractedStrings objectAtIndex:2] was the number of the square they moved to (where the first row of the board was 1-2-3 and the last row was 7-8-9).
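For anyone curious what pulling groups out looks like without the category, here is a minimal sketch using NSTextCheckingResult. SketchGroups is a hypothetical helper, not the category's code; it only looks at the first match and is hard-wired to case-insensitive matching for brevity.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns the text of each parenthesized group from the
// first match of `pattern` in `input`, or an empty array if nothing matches.
static NSArray *SketchGroups(NSString *input, NSString *pattern) {
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:pattern
                                                  options:NSRegularExpressionCaseInsensitive
                                                    error:NULL];
    NSTextCheckingResult *match =
        [regex firstMatchInString:input options:0 range:NSMakeRange(0, [input length])];
    NSMutableArray *groups = [NSMutableArray array];
    if (match == nil) {
        return groups;
    }
    // Range 0 is the whole match, so the groups start at index 1.
    for (NSUInteger i = 1; i < [match numberOfRanges]; i++) {
        NSRange range = [match rangeAtIndex:i];
        // A group that took no part in the match reports NSNotFound.
        [groups addObject:(range.location == NSNotFound) ? @"" : [input substringWithRange:range]];
    }
    return groups;
}
```

Note the off-by-one: NSTextCheckingResult counts the whole match as range 0, whereas the array returned here starts at the first group, which matches the indexes used above.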

Now, there are many other ways I could have encoded that, but the nice thing about using strings for it was that anyone looking at the intermediate value (in the debugger or logs) could easily tell what move was being talked about at that point, and if I were ever to need to come back to this code later, @"Move 1=Player X to Square 1" will make sense to me (after all, that kind of notation has been of use in the Chess world for hundreds, if not thousands of years).

But aren't Regular Expressions slower?

Well, define slow :-).

In the test suite for my RegexOnNString category, I have a test that does 1000 string replaces:

for (uint i=0; i< 1000; i++) {
    if (lastTimeString) {
        NSString *currentNumberString=[NSString stringWithFormat:@"%u",i];
        NSString *replacementString=[lastTimeString stringByReplacingRegexPattern:@"[0-9][0-9]*" withString:currentNumberString caseInsensitive:NO];
        STAssertEqualObjects(replacementString, currentNumberString, @"regex replace failed");
    }
    lastTimeString=[NSString stringWithFormat:@"%u",i];
}

Now each of those regexes is different (by design), so I can't compile them ahead of time, and I'm creating and throwing away my NSRegularExpression object and a temporary NSString on every run. So it's near a worst-case scenario. By way of comparison, I do another loop of 1000 replaces, using -[NSString stringByReplacingOccurrencesOfString:withString:], to see how much slower the regex makes the task.
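A comparison like that can be sketched roughly as follows. This is an illustration, not the test from the repo: SketchTimeReplaces is a hypothetical name, it talks to NSRegularExpression directly instead of going through the category, and it measures wall-clock time with NSDate, which is coarse but good enough for a ballpark.

```objc
#import <Foundation/Foundation.h>

// Hypothetical benchmark helper: performs `count` replaces and returns the
// elapsed wall-clock seconds. A fresh NSRegularExpression is built on every
// pass when useRegex is YES, to mimic the throwaway-object worst case.
static NSTimeInterval SketchTimeReplaces(BOOL useRegex, NSUInteger count) {
    NSDate *start = [NSDate date];
    NSString *previous = @"0";
    for (NSUInteger i = 1; i <= count; i++) {
        NSString *current = [NSString stringWithFormat:@"%lu", (unsigned long)i];
        NSString *replaced = nil;
        if (useRegex) {
            NSRegularExpression *regex =
                [NSRegularExpression regularExpressionWithPattern:@"[0-9][0-9]*" options:0 error:NULL];
            replaced = [regex stringByReplacingMatchesInString:previous
                                                       options:0
                                                         range:NSMakeRange(0, [previous length])
                                                  withTemplate:current];
        } else {
            replaced = [previous stringByReplacingOccurrencesOfString:previous withString:current];
        }
        NSCAssert([replaced isEqualToString:current], @"replace failed");
        previous = current;
    }
    return [[NSDate date] timeIntervalSinceDate:start];
}
```

Dividing SketchTimeReplaces(YES, 1000) by SketchTimeReplaces(NO, 1000) gives a "times slower" ratio of the same kind as the one in the logs; the actual numbers will depend entirely on the device.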

The output from running the test on my 4th-gen iPod touch is:

2011-10-15 17:23:00.748 RegexOnNSStringIOSExample[224:607] 1000 regex replaces took 0.167364 seconds
2011-10-15 17:23:00.776 RegexOnNSStringIOSExample[224:607] 1000 String replaces took 0.025167 seconds
2011-10-15 17:23:00.777 RegexOnNSStringIOSExample[224:607] Simple String substitution 6.650140 times faster

and on my daughter's 2nd-gen iPod touch, the output is:

2011-10-15 17:33:37.631 RegexOnNSStringIOSExample[183:307] 1000 regex replaces took 0.641442 seconds
2011-10-15 17:33:37.756 RegexOnNSStringIOSExample[183:307] 1000 String replaces took 0.119230 seconds
2011-10-15 17:33:37.768 RegexOnNSStringIOSExample[183:307] Simple String substitution 5.379869 times faster

So yes, it's slower. Each regex replace takes about 0.17 milliseconds on a 4th-gen touch and 0.64 milliseconds on a 2nd-gen touch (the totals above divided by 1000), which is between 5 and 7 times slower than stringByReplacingOccurrencesOfString:withString:. If an extra 0.52 ms per replace really matters in your code when running on a 2nd-gen touch, then you should use stringByReplacingOccurrencesOfString:withString: instead.

So, in conclusion,

I hope you found this post useful. If you need to do string manipulation, Regular Expressions are a time-tested way to do that, and I hope the extra methods I've talked about here will simplify things if you want to do string manipulation in your Cocoa code.

Saturday, Sep 3, 2011

Steal This Code and Protect Their Data: Simplifying KeyChain Access

[Embedded items: "Invalidname: Meet iPhone Explorer", "Invalidname: Learn Keychain", "Noel Llopis: Keychain is Obtuse"]

The Code

The last couple of months, I've been working on my first Mac App (more on that in a later post).  As part of this App, I'm calling a REST API that requires that I have the user's password for that service to use in the API calls.  Although that API is a minor part of the App, and although the service doesn't have horrible consequences if someone gets the user's password for it (in my opinion at least), there was no way I was going to store that password on disk unencrypted.  After all, users have a bad tendency to use the same password for multiple services, and one of those other services might contain important information.

So I dug into the Keychain documentation, and it took me a while to figure it out.  Meanwhile, I was learning Bindings for the Mac App, since in my time programming iOS, I'd never had the chance to use Bindings before.  And I decided that it was a good opportunity for me to combine the two and learn something, and maybe help someone else along the way. I fought with it off and on for a month or so, and released it under the MIT license at the end of July.

This is the result.  It's a project that simplifies using the Keychain by making it accessible through methods patterned after NSUserDefaults.

Their Data

So here's the problem: anything you persist in your App unencrypted can easily be extracted by a program like this. If you put your own encryption in your App, you could run afoul of Apple's encryption policies and potentially Law Enforcement Organizations. The KeyChain makes it possible to protect the data that you persist from (all but the most determined) prying eyes.

Now many programmers don't think they're persisting any data that they need to protect, because they don't get passwords from their users.  But think for a minute about other information the user might not want anyone to be able to see.  And then think about any data that you wouldn't want the user to be able to read (or alter).  I don't write games myself, but when I talk to my friends that do, I hear them complain a lot about people "cheating" by trying to hack their save games.  While you wouldn't want to stick a huge amount of data in the keychain, some strategically selected pieces of data (current amount of "gold" the user has, or maximum hit points) might be appropriate to store in a safer location than in a file on disk.

How to Use it

I wanted this library to be as easy to use as possible, so I decided to make it match the semantics of NSUserDefaults. That's in every iOS programming book I've ever seen, so in theory, it should be well known to anyone needing it.

To install it, check it out from GitHub, grab the 4 files in the PDKeychainBindingsController folder (the .h and .m files for PDKeychainBindings and PDKeychainBindingsController) and drag them into your project in Xcode.

Then, when you would normally have used:

[NSUserDefaults standardUserDefaults]

You should be able to call

[PDKeychainBindings sharedKeychainBindings]

instead (at least for the most common methods).  If you're doing an OS X App, and you're binding an NSTextField or the like, then where you would have called

[NSUserDefaultsController sharedUserDefaultsController]

use

[PDKeychainBindingsController sharedKeychainBindingsController]

instead (again, at least for the most common methods).

There are two differences. The first is that the Keychain API only wants to work with Strings (well, NSStrings).  So if you want to store something else in there, you need to convert it to a string yourself before you put it in the keychain (and change it back when you take it out).

The second is that, in order to simplify it, I took out the need to run the synchronize method.  As soon as you call the set method, it gets persisted.
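As a sketch of the string round trip from the first difference above, here is one way to store a number. To keep the example self-contained, an NSMutableDictionary stands in for the keychain; the assumption (unverified here) is that [PDKeychainBindings sharedKeychainBindings] would slot in through the same NSUserDefaults-style setObject:forKey: and objectForKey: calls. The helper names and the @"gold" key are made up for illustration.

```objc
#import <Foundation/Foundation.h>

// Stand-in store for this sketch: a plain in-memory dictionary. With the
// library, the shared keychain bindings object is assumed to take its place.
static void SketchSaveGold(NSMutableDictionary *store, NSInteger gold) {
    // Convert the number to an NSString before handing it over.
    NSString *asString = [NSString stringWithFormat:@"%ld", (long)gold];
    [store setObject:asString forKey:@"gold"];
}

static NSInteger SketchLoadGold(NSMutableDictionary *store) {
    // Convert back on the way out; a missing value (nil) comes back as 0.
    NSString *asString = [store objectForKey:@"gold"];
    return [asString integerValue];
}
```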

I'd like to thank Chris Adamson and Noel Llopis for unwittingly helping me decide on the topic for this post.