Three Important Lessons for New Coders

I’m told there are workplaces where large, skilled software engineering teams follow textbook development processes and efficiently produce reams of high-quality code. I’ve never worked at a place like that.

The rest of us are stuck dealing with:

  • small teams lacking practical software engineering experience
  • big, ugly legacy code bases
  • having to write code in between meetings, PowerPoint, and other job responsibilities, and
  • engineering processes ranging from “loose” to “non-existent”.

This is a dangerous world for a new coder. It’s easy to develop bad habits. Ultimately, it’s up to you to keep pace with the larger software engineering community, and this extra effort pays off when you look for your next job or project. There are tons of books and blogs about software engineering. But here are the three most important things you can do to improve your value as a software engineer:

1. Write automated tests for your code.
Every language has one or more popular testing frameworks (like JUnit for Java and Nose with unittest for Python). Back in college before I learned to write unit tests, I would write a quick script or modify the application to print some debugging information. That’s how I would verify that some new function or class was behaving properly. Unit testing formalizes this process. The biggest benefit is that you build up a collection of these tests that you can run again later. This helps squash bugs caused by new code breaking older code. Even if no one else on your team is writing automated tests, you can and should write test cases for your work.
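To make that concrete, here is a minimal sketch using Python's built-in unittest module. The word_count function and the test names are made up for illustration; swap in whatever function you would otherwise have checked with a print statement.

```python
import unittest


def word_count(text):
    """Hypothetical function under test: count whitespace-separated words."""
    return len(text.split())


class WordCountTest(unittest.TestCase):
    """Each method replaces one ad hoc 'print and eyeball it' check."""

    def test_counts_words(self):
        self.assertEqual(word_count("write less code"), 3)

    def test_empty_string_has_no_words(self):
        self.assertEqual(word_count(""), 0)

    def test_extra_whitespace_is_ignored(self):
        self.assertEqual(word_count("  two   words  "), 2)
```

If you save this in a file (say test_word_count.py, a name I picked for the example), running `python -m unittest test_word_count` executes all three tests, and you can rerun them every time the code changes.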

2. Use revision control for your source code.
Source code revision control systems let you commit code changes to a repository and preserve a revision history as your project evolves. They’re pretty much a necessity when coding with a team. It doesn’t matter which one you use, but be warned that a lot of developers are seriously attached to their system of choice. Subversion is really easy to learn, so I’d suggest starting there. Git is what I use now, but anyone who tells you it’s easy to learn is a lying liar. A less obvious benefit of using source control is that you can delete old code without worrying. You can always get it back from the revision history. Uncluttering your project makes it easier to work on.

3. Learn some important design patterns.
Breaking down your project into a class hierarchy can actually be pretty fun. There are lots of design patterns you can follow to decompose your problem. I’ve found these two to be the most helpful for small projects: Composition (often a better option than traditional inheritance taught in introductory programming classes) and Dependency Injection (DI). Despite a scary-sounding name, DI is a very simple design pattern that helps you abstract out dependencies on libraries and services. If you’ve ever had to change the database you’re using, for example, you’ll immediately see why this is a great technique.
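Here is a rough sketch of both ideas together in Python. All of the names (InMemoryStore, Guestbook, sign, message_from) are hypothetical and exist only for this example. The point is that Guestbook has a store rather than being one (composition), and that the store is handed in from outside (dependency injection).

```python
class InMemoryStore:
    """Hypothetical storage backend that keeps records in a dict."""

    def __init__(self):
        self._rows = {}

    def save(self, key, value):
        self._rows[key] = value

    def load(self, key):
        # Returns None when the key has never been saved.
        return self._rows.get(key)


class Guestbook:
    """Holds a store (composition) instead of inheriting from one."""

    def __init__(self, store):
        # The dependency is injected: Guestbook never builds its own store,
        # so any object with save() and load() methods will work here.
        self._store = store

    def sign(self, name, message):
        self._store.save(name, message)

    def message_from(self, name):
        return self._store.load(name)


book = Guestbook(InMemoryStore())
book.sign("ada", "hello")
print(book.message_from("ada"))  # prints: hello
```

Switching to a real database would then mean writing one new class with the same save and load methods and passing it in. Guestbook itself would not change, and the tests can keep using the in-memory version.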

And one last bonus tip: write less code! The great Edsger Dijkstra teaches us to think in terms of “code spent” rather than “code produced”.

Cloud all the things

Well, now I’ve seen it all. I saw an ad for this new product on my Facebook feed today. Western Digital put an ethernet port on a portable hard drive and called it a “personal cloud”.

Remember the date. November 19, 2013: the day “cloud” stopped meaning anything.

WD's 'personal cloud'

3 Themes from HPEC 2013

Last week, I attended the annual IEEE High Performance Extreme Computing Conference (formerly called High Performance Embedded Computing) in Waltham, MA. I had the privilege of presenting my paper on distributed database performance, and I got some great comments and questions.

Here are the key themes that came across in many of the talks and keynotes:

1. Hadoop MapReduce is entering the “trough of disillusionment”
The MapReduce programming pattern is inadequate for all but the simplest of analytics. On top of that, the Hadoop implementation of this classic model of parallelism is bogged down by a weak scheduler and inefficient middleware. Looking ahead, it can serve “embarrassingly parallel” applications, but future versions need to address some of the performance problems.

2. The next generation of intelligent analytics relies on sparse computation
In years past, the HPEC community was laser-focused on signal processing and accelerators such as FPGAs and GPUs. The engineer’s goal was to squeeze every last FLOP out of a computing system. This year, about 10 talks dealt with applications of sparse matrices and data structures. This is a major shift.

3. Tomorrow’s chips are going to look a lot different
Intel’s latest commercial chips are 22nm. Moore’s Law will get transistors down to 5-10nm. After that? This community has to innovate. Talks from Intel, Texas Instruments, MIT, Carnegie Mellon and others mentioned tricks like 3D chip stacking, making memory smarter, and other novel architectures as ways to cram more transistors on a chip, move data faster, and accommodate future applications that look nothing like the dense computation for which today’s systems were optimized.

Reach out on Twitter or leave a reply below if you noticed other themes!

Checking In

Wow, I haven’t posted in a while. I’ve been busy writing… just not here. I’m fortunate to have some upcoming publications for 2013, and they’re all on very different topics:

Accepted: S. M. Sawyer, B. D. O’Gwynn, A. Tran, T. Yu. “Understanding Query Performance in Accumulo.” IEEE High Performance Extreme Computing Conference (HPEC ’13). Waltham, MA. September 2013.

In Press: K. Ni, N. Armstrong-Crews, S. M. Sawyer. “Geo-registering 3D Point Clouds to 2D Maps with Scan Matching and the Hough Transform.” ICASSP 2013. Vancouver, Canada. May 26-31, 2013.

In Press: D. Whelihan, J. Hughes, S. M. Sawyer, et al. “P-sync: A Photonically Enabled Architecture for Efficient Non-Local Data Access.” 27th IEEE International Parallel & Distributed Processing Symposium (IPDPS ’13). Cambridge, MA. May 21, 2013.

I’ll be sure to post links to the papers on my publications list when they’re available online. And I hope to be adding a couple more to this list soon!

Town-Wide Garage Sale Web App

The Borough of Madison, NJ organized a town-wide garage sale to benefit Union Beach, a community hit hard by Hurricane Sandy. I added a feature to their website, so residents can register their sale and (pending a quick review by a website administrator) have it placed on a map of Madison. Check out the map here!

Screenshot of Madison’s town-wide garage sale map mashup

I built the town’s extensible content management system starting in 2008, working closely with Jim Sanderson, Madison’s technology director. The history of “RoseNet” (named for Madison’s “Rose City” moniker), one of the first community websites, dates back to 1997. It has gradually grown to include more and more content and interactive features. I haven’t seen any other municipal websites that come close in terms of quantity and quality of data. For example, Jim has invited all local businesses and non-profit organizations to have a self-maintained presence on the website, and this cool business map is a fun way to promote Madison’s downtown merchants. The local business listings provide a helpful service to residents and help boost revenue by promoting the business district in the face of competition from internet and big box retailers. There is, of course, room for improvement (we’re constantly debating the home page and navigation), but the site has proven to be a very effective communication medium for the town.

The garage sale registration process highlights a flexible workflow capability that will help the municipal government streamline many forms and processes. Plus, thanks to years of careful design and execution, Jim and I were able to deploy this feature with minimal cost and lead time.

I think we’ve really stumbled onto a great architecture for a civic website. I’d love to turn the design into an open source project. Recently, I’ve heard a lot about civic code volunteering and startups (like the Code for America project), and I think that’s really great. Even though Madison is a small town, I think we’re out in front in terms of designing an effective municipal content management system. Drop me a line [Twitter, Google Plus, or just leave a comment below] if you’re interested in collaborating.