2010.03.04

Grepping through things

02.01.09 - Mark

I've been aware of GREP for years. I sort of remember it as a feature in BBEdit Lite, the great program I really learned HTML with, but I always made do with the regular, plain English search and replace commands to fix problems. That stayed true when I finally moved to TextWrangler (after it was released as freeware to fully replace BBEdit Lite).

I also knew that GREP, sometimes referred to as Regular Expressions, was in Perl and PHP, and I'd seen it on XKCD, both as a comic and as a t-shirt. I probably knew it was available at the command line of OS X and Linux. Those references and bits of knowledge made me aware of it, but it wasn't until earlier this year that I was given a task that changed things. I probably could have hacked through it with traditional search and replace, but it needed enough changes that I felt it would take less time to learn a bit of regular expressions.


Regular-Expressions.info helped me a lot, but it still took a little longer to figure out than I had guessed. However, it was well worth the effort. Over and over, GREP has proven to be very helpful, and while I'm not a master of the syntax, I can do a bit of damage with it in TextWrangler without a cheat sheet.

About two weeks ago, I stepped up to using some grep in a PHP script.

I've been reading Rockwood Comic off and on for years, but its lack of an RSS feed sort of pushed it further back. The odd thing is I remembered a bit about scraping websites to create RSS feeds. While there are plenty of tools out there that will do the same thing, part of me figured it would be simpler, more precise and up to date, and a fun little challenge to create it myself - at least for that comic site. Plus, once it was kicking around in my head, I knew I'd be using GREP in PHP to dig out some of the content. Once I think it's working a bit better I'll think about writing another post on the hack, especially since this post is mostly rambling on about how wonderful a tool GREP is for geeks. The short version: I created an RSS feed for Rockwood Comic using PHP, GREP, and cron to write the RSS file, all wrapped up in Feedburner to give it a prettier URL than the sandbox address my code resides at.
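
For the curious, the general shape of a regex scrape like that can be sketched in a few lines. This is a rough illustration in Python rather than the PHP the real script uses, and everything specific in it is made up: the sample markup, the pattern, the example.com address, and the names `SAMPLE_PAGE`, `find_comics` and `build_rss`. The real site's HTML is different, so treat it as a sketch under those assumptions and not as the actual hack.

```python
import html
import re
from xml.sax.saxutils import escape

# Hypothetical markup; the real comic site's HTML will differ.
SAMPLE_PAGE = """
<div class="comic"><a href="/comics/101"><img src="/img/101.png" alt="Cats &amp; Dogs"></a></div>
<div class="comic"><a href="/comics/102"><img src="/img/102.png" alt="Monday"></a></div>
"""

# Capture the link target and the alt text of each comic entry.
COMIC_PATTERN = re.compile(
    r'<a href="(?P<link>/comics/\d+)"><img src="[^"]+" alt="(?P<title>[^"]*)">'
)


def find_comics(page):
    """Return (link, title) pairs, with HTML entities in the title decoded."""
    return [
        (m.group("link"), html.unescape(m.group("title")))
        for m in COMIC_PATTERN.finditer(page)
    ]


def build_rss(comics, base_url="http://example.com"):
    """Build a bare-bones RSS 2.0 document from (link, title) pairs."""
    items = "".join(
        "<item><title>%s</title><link>%s</link></item>"
        % (escape(title), escape(base_url + link))
        for link, title in comics
    )
    return (
        '<?xml version="1.0"?>\n<rss version="2.0"><channel>'
        "<title>Comic feed (example)</title>"
        "<link>%s</link><description>Scraped feed sketch</description>"
        "%s</channel></rss>" % (escape(base_url), items)
    )


if __name__ == "__main__":
    print(build_rss(find_comics(SAMPLE_PAGE)))
```

A cron job would then only need to run the script on a schedule and write the output to a file the feed reader can fetch.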

Link | 0 Comments |


2007.12.04

Clearing out the undergrowth

13.51.11 - Mark

My RSS feed reading has been on and off for a few months. Part of that is flaky hardware, but most of it is the mess of RSS feeds I try to keep up with. Up until this week I've been keeping related feeds together: all the news in one folder, all the blogs in another. It was a tangled mess, and my irregular reading left a lot of dead feeds in the system next to some hyperactive feeds reporting hundreds of unread posts. When I was browsing feeds, half the time was spent finding the good feeds with information I wanted.

Last week, when 43 Folders posted a tip on organizing feeds by the value of the source rather than the topic, I set it aside to use as a guideline for sorting out my RSS mess.

I'm not 100% done with the sort and toss, and I've got a folder filled with broken feeds, but I can already tell the new system is working for me. I'm staying caught up with my high signal-to-noise ratio feeds, nearly caught up with middle of the road feeds, and I'm almost comfortable with ignoring everything I have ranked below that threshold. It feels good to have organized feeds again. Which I suppose means I need to stop putting off some of my planned code updates for this site...

Link | 0 Comments |


2006.01.26

Site Feed Improvements

17.42.04 - Mark

The default site feed should have enclosures now. I also fixed the bug mentioned earlier. Stupid Ampersands.
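
For anyone fighting the same thing: browsers shrug off a bare `&` in a link, but an XML parser does not, so a single unescaped URL can derail a whole feed. Here is a minimal sketch of the idea, in Python and not the blog's actual code. `make_item` and the example.com URLs are hypothetical; the standard library's `escape` and `quoteattr` do the real work.

```python
from xml.sax.saxutils import escape, quoteattr


def make_item(title, link, enclosure_url=None, length=0, mime="audio/mpeg"):
    """Build one RSS <item>, escaping text and attribute values for XML."""
    parts = [
        "<item>",
        "<title>%s</title>" % escape(title),
        "<link>%s</link>" % escape(link),  # a bare & becomes &amp;
    ]
    if enclosure_url:
        # quoteattr escapes the value and wraps it in quotes.
        parts.append(
            "<enclosure url=%s length=%s type=%s />"
            % (quoteattr(enclosure_url), quoteattr(str(length)), quoteattr(mime))
        )
    parts.append("</item>")
    return "".join(parts)


if __name__ == "__main__":
    print(make_item(
        "Q & A",
        "http://example.com/?a=1&b=2",
        enclosure_url="http://example.com/ep.mp3?x=1&y=2",
        length=12345,
    ))
```

The enclosure is optional here, so the same function would cover both plain posts and posts with media attached.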

Link | 0 Comments |


Blog Engine Code Updates

15.11.50 - Mark

I'm squashing some bugs in the blog code right now, specifically the one in my categories where items with non-alphanumeric characters screwed it up. This wouldn't have been a problem except I've tagged a few things with spaces. I mainly did that one for myself: I got tired of seeing spaced categories making up my error reports.
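
A common cure for this sort of bug is to percent-encode the category name before it goes into a URL and decode it on the way back in. The sketch below is in Python and only illustrates that general technique; it assumes categories live under a `/category/` path, and `category_to_url` and `url_to_category` are hypothetical names, not functions from the blog engine.

```python
from urllib.parse import quote, unquote


def category_to_url(name, base="/category/"):
    """Percent-encode a category name so it is safe in a URL path."""
    # safe="" also encodes "/", so a slash in a name cannot split the path.
    return base + quote(name, safe="")


def url_to_category(path, base="/category/"):
    """Recover the original category name from an encoded path."""
    return unquote(path[len(base):])


if __name__ == "__main__":
    for name in ["web design", "R&D", "mac os x/tips"]:
        url = category_to_url(name)
        print(name, "->", url, "->", url_to_category(url))
```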

The next major changes are going to be to the RSS feed. I've made 4 links in the last 20 posts that mucked it up (which is why my whole feed is showing up as raw HTML). I've found the 4 responsible URLs, so I'll start working out a fix. Depending on how complex that gets, I'll probably get around to putting in enclosures on the main feed, as well as a separate media enclosed feed.

Link | 0 Comments |


2005.12.31

Welcome to Blog 2.0

14.05.54 - Mark

After many hours of programming, many, many more hours fiddling around with the database, and a few seconds of changing some settings, I'm considering myself moved off of Blogger.

I'd be yelling and screaming with delight (because I was extremely tired of Blogger), but for as much work as I've put into this, it's actually a very anticlimactic changeover. The last couple of hours were spent fighting with a little navigation script and the Blogger redirection tool (which I'll be waiting a couple of days to activate).

Let me know if anything fails horribly. I'm going to step away from the code for a day or two (I haven't opened up my RSS reader in a week).

And sorry for breaking the Feedburner feed. Unfortunately it might happen again a few times as I work out kinks in the RSS script.

Link | 0 Comments |