My Pedantic Antics

Years ago, I heard someone tell a story about a boy who never said anything. His parents knew he could talk, but he just didn’t. He would nod yes or no, or shrug, but never talk. One day, when he was seven, his family was sitting around the dinner table and he suddenly blurted out: “These beans are LOUSY!”

His parents were completely taken aback: their boy had finally spoken. They couldn’t stop themselves from shedding tears of joy and celebration. When things calmed down they asked him, “How come you’ve never said anything all these years?” and the boy responded: “I dunno, I guess everything has been fine until now.”

I say all that so I can simply tell you that I don’t always have something to say, which is why this blog has been so inactive for so long. I always swore up and down that if I really didn’t feel I had anything valuable to add, I simply wasn’t going to add to the noise.

However, I recently posted a tweet that got some discussion. Now, this was mostly in fun. I love to talk about stuff like this: coding preferences, pet peeves, etc. (which is obvious from a series of our episodes [Pet Peeves Episode 1, Pet Peeves Episode 2, Pet Peeves Episode 3] of @SystemDotDebug). However, I had a few responses remarking on how “pedantic” it was. Now, I didn’t really take this personally, but I did start to think about whether or not such things are *really* important. I started thinking about it so much that it’s even going to be the subject of our next live podcast (at least it’s supposed to be; these things often change, and quickly).

I’ve never worked for a company that actually had published coding standards. Perhaps that is because I’ve pretty much only ever been responsible for my own code. Working for small firms as I have my entire career, I haven’t had to “share” my code with many developers. Those I have had to share it with just so happened to believe as I do, so I think I was just lucky.

At the end of the day, if the code works, it works, right? There are questions, however, that perhaps you should ask yourself (“yourself” being either you personally or your company as a whole):

  • Is the code easily readable? (Code Like a Poet — thanks Bonny!)
  • Can a junior programmer easily catch on to what the code is doing?
  • Does the code have a consistent look and feel to it or is each file/class clearly written on different days by different devs?

The people in the Python community have been doing this way longer than most of us, and there is a REASON they wrote something called PEP-8, and it wasn’t to be pedantic. I’m going to quote something from that resource that I’ve always felt: “…code is read much more often than it is written.” I’m just gonna let that sink in for a second… I’ll wait… and follow it up with another quote from that same paragraph that references PEP 20: “Readability counts.”

PEP-8 also introduces the idea of knowing when to “break” the guidelines (like Captain Barbossa says, “the code is more what you’d call ‘guidelines’ than actual rules”), so it’s okay to break them when doing so makes sense. Say you’re editing tons of old code. You have two choices: clean up the code while you’re in there (that’s my preference; if you’re already in there, do it) or adopt the style used in that class. Sometimes it’s so off from what everyone is doing that you *have* to update it (or maybe that’s just me being pedantic).

Again, I’m going to cite PEP-8 and list out their thoughts on when to break the guidelines verbatim:

1. When applying the guideline would make the code less readable, even for someone who is used to reading code that follows this PEP.

2. To be consistent with surrounding code that also breaks it (maybe for historic reasons) -- although this is also an opportunity to clean up someone else's mess (in true XP style).

3. Because the code in question predates the introduction of the guideline and there is no other reason to be modifying that code.

4. When the code needs to remain compatible with older versions of Python that don't support the feature recommended by the style guide.

Notice how I’m conveniently NOT telling you WHICH standard to follow (if there are multiple standards, is it really a standard?!?). I’m just telling you that you should strive to pick ONE and follow it throughout your team. We could talk/argue at length over which style of curly brace placement you should use, or the value of the ternary operator (I really do hate them… I do, I do, I do), but that is all down to preference. Overall, it’s about being consistent.

So is your cuddling or non-cuddling of your “else” statements important? No, as long as everyone is doing it the same way — at least across a given project. So then I guess, Yes? Which is it? Yes or No? Maybe I’m just being pedantic…

:wq!

I’m Embarrassed

I’m embarrassed!  Every now and then — okay, quite frequently — I’m reminded that when I started my I.T. career I had no interest in being a developer.  It wasn’t until I got sick of getting certified in this technology or the other, and being married to a pager that I decided I wanted to learn how to be a developer.  I had no official training and am pretty much “self taught” — and at times, I feel that it shows through.

For example, I have a large client that has a huge email subscriber list that they send monthly emails to. (Yes, they’re opted in, so don’t go all *spammer* on me.) Occasionally they like to do targeted blasts based on zipcode (for those members we actually have a zipcode for). This past month’s blast had over 28,000 zipcodes in the target list. Due to an oversight, I wound up with a list of zipcodes that overlapped. Basically it came down to having two files, one of 132277 lines and another of 113035 lines. In these two files were approximately 30,000 overlapping email addresses that would have received both the targeted and non-targeted blasts had I not caught it. *OUCH* that would NOT have been good.

I decided to parse through the two files with Python since its syntax is fresh in my mind, and I’d have wound up googling too much had I decided to do it in bash, and this was time-sensitive stuff. So I busted out vi and wrote the following (don’t laugh):

INFILE1 = 'all-email.lst'
INFILE2 = 'nofp-list.txt'

destinations = []
dupes = []

for line in open(INFILE1, 'r').readlines():
    for line2 in open(INFILE2, 'r').readlines():
        if line != line2:
            print line2
            destinations.append(line2)
        else:
            dupes.append(line2)

This code subsequently hung my machine as it struggled to loop over so much. I knew this wasn’t gonna be the final version as I was writing it (I had to get my creative juices flowing first), but I really didn’t expect it to hang my machine. I had to power off my machine and then revise the code once my system came back up. After some initial tweaks I had this (thinking that it was just too much to read all that into memory, and failing to see the real problem for what it was: that nested for loop):

import fileinput

INFILE1 = 'all-email.lst'
INFILE2 = 'nofp-list.txt'

destinations = []
dupes = []

for line in fileinput.input([INFILE1]):
    for line2 in fileinput.input([INFILE2]):
        if line != line2:
            print line2
            destinations.append(line2)
        else:
            dupes.append(line2)

This didn’t work either, as it was giving me an “input already open” error. Rather than investigate further, I turned around and ultimately ended up with this, and thought to myself, “that was dumb, I KNOW better than that”:

INFILE1 = 'all-email.lst'
INFILE2 = 'nofp-list.txt'

destinations = []
dupes = []

list1 = open(INFILE1, 'r').readlines()
list2 = open(INFILE2, 'r').readlines()

for line in list2:
    if line in list1:
        dupes.append(line)
    else:
        print line
        destinations.append(line)

*DUH* read them both into a list and use Python’s “in” operator to look for one in the other. Done. Now, I’m positive there are even more iterations that this code could have eventually evolved into, but this got the job done and didn’t suck up my machine’s resources. Since it was a one-off, I stopped here. (NOTE: the code might not be exactly as I had it since this is largely from memory.)
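One such further iteration, offered here as a sketch rather than what was actually run: "in" on a list is still a linear scan, so the version above can make up to roughly 15 billion comparisons in the worst case (132277 × 113035). It just does them in C and without building a giant list along the way. Putting the first file into a set makes each lookup constant time on average. The function names below are hypothetical, and stripping whitespace before comparing is an assumption (the original compared raw lines, newline and all).

```python
INFILE1 = 'all-email.lst'
INFILE2 = 'nofp-list.txt'


def split_overlap(all_lines, target_lines):
    """Return (destinations, dupes) for target_lines, checked against all_lines.

    Assumption: lines are compared after stripping surrounding whitespace,
    so a missing trailing newline on the last line does not cause a mismatch.
    """
    # A set gives average constant-time membership tests, unlike a list.
    seen = {line.strip() for line in all_lines}
    destinations = []
    dupes = []
    for line in target_lines:
        address = line.strip()
        if not address:
            continue  # ignore blank lines
        if address in seen:
            dupes.append(address)
        else:
            destinations.append(address)
    return destinations, dupes


def split_files(path1=INFILE1, path2=INFILE2):
    """Hypothetical wrapper: run split_overlap over two files on disk."""
    with open(path1) as f1, open(path2) as f2:
        return split_overlap(f1, f2)
```

Calling split_files() and printing each entry of the first returned list would reproduce what the earlier versions printed, minus the trailing newlines.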

In any case, it’s situations like these that I both despise and enjoy at the same time. I despise them because it should be easy, and I am embarrassed by the fact that my first revision was so *dumb*, as if it lacked any thought. But I enjoy them because it’s a finite problem to solve and it allows me to exercise my brain a bit. Being someone who is starting to spend less time coding and more time in meetings, it feels good to do these exercises.

CLEARLY I’ve uncovered the need for me to code up a better process for doing these targeted blasts, as it would appear they are going to be doing more and more of them. I look forward to writing that code so I don’t have to go through something like this that should have been so very elementary! *Embarrassing!*

I invite you to share your approach for such a situation, A) so I can learn more from it and B) to find out if I’m really that far off anyway…

:wq!