Stop eating spaghetti code

Date: 4 Oct 2016
Reading time: 22 mins

Index

The people’s problem
- Communication
- The Engineering Process
  - SCRUM
  - Leadership vs Boss
The technical issues
References

if else if else if else if else: we live in a world which is just too much conditional sometimes. If it rains, I open the umbrella. If I am hungry, I eat. Else, I do not. The if-else constructor is easy, simple to use, but also very simple to misuse. Recently, I have come up on this interesting article on “The Daily WTF” [1]:

 if(decoded.tags)
 {
    if(decoded.tags[1] == "ok")
    {
       var _loc3_:* = decoded.tags[0];
       if("getcashgames" !== _loc3_)
       {
          if("lobby_subscribe" !== _loc3_)
          {
             if("getcgtickets" !== _loc3_)
             {
                if("gettourtickets" !== _loc3_)
                {
                   if("getflopsseen" !== _loc3_)
                   {
                      if("getbonuses" !== _loc3_)
                      {

MORE CODE LIKE THAT :/

Often, especially in the academic world, we care always less and less about code quality and more about features, outputs. What we often do not realise is that features and code quality are very related to each other.

In this article I will talk about my experience in “getting the shit out of code mess”. This is not going to be the ultimate definite guide on how to make good software or similar. There are two main approaches on solving coding and architectural mess. The first one is to understand how the team works and whether there are problems within their engineering process. “Bad software architecture is a people problem” (Kate Matsudaira) [2]. The second approach involves the technical issues and actually solving them.

Notes:

I have been working on various projects (open source and proprietary). I will not refer to any of them in particular. However, the problems, challenges and solutions provided below are inspired on real case studies.
There are many more aspects regarding the art of making a good and successful product than the ones listed here. One, for instance, is marking. Nonetheless I won’t talk about marketing or business aspects of a software artifact.
This is a very opinionated article.

The people’s problem

People have problems. Always. Machines do, so why not people who are definitely less binary. What these problems often come down to are: communication and the establishment of a proper engineering process. Whoever might not agree with this statement, probably does not have proper experience working with a team of more than one or two people.

Communication

There is a general assumption that computer scientists and/or programmers are not very good at communicating. While there might be some truth in this, it is also true that most, if not all, of the world-leading computer scientists are excellent communicators. For a thing, knowing English well is important not just to read answers on stackoverflow, but also to communicate your ideas with people who do not speak your language. And this happens often in the world of today. However, there are many more levels of communication that matter in making the best out of your team and project. There are three types of communication:

face-to-face
via email
via messaging systems

As a developer and CTO, I experienced challenges using all three types of communications. Face-to-face communication is a really good way to get information out of someone. It is also good to communicate your ideas. Based on the role you play, you must communicate with the relevant parties in order to understand requirements, problems, direction of the project or technical issues. In one of the projects I was involved in, this was often not done and the consequences were poor quality and inconsistency in the behaviour of the product. So do talk, face-to-face.

The other two forms of communication, email and messaging systems, might sound similar, and they are, but they are also very different. Emails work very well when communicating with members of the team who are not active on a daily or weekly base. They also do work well when communicating with people that are outside of the project. Nonetheless, using emails for frequent conversations can be problematic. In this other project I was involved in, we were 4-5 part-time developers and one part-time manager. Being all part-time is already challenging. This often means that the team cannot meet all at once. We had meetings involving two or three of us, but almost never the entire team. Emails followed the same pattern. Eventually, there were conversations like:

A: "As I was saying this should happen on BUGA V1.1"
B: "But this is what the requirements says"
A: "Yes, but we said last week that we need more replication"
B: "What replication? I did not know about it"
C: "Actually, A, there was on update. We must do X,Y,Z".

In the above conversation there are two problems that arise. (I) team members do not seem happy about the situation, since they are not well-informed on the latest facts; (II) A, B, and C do not really know what to do and this usually leads to inconsistency within the product (or between products). Our solution, in that case, was to use a messaging system (slack) to centralise communication. We actually started using slack to talk about random stuff too. This brought the morale up and good effects on the product.

In another project, there were multiple teams working on separate products. Some of the products exposed an API to be used by others. We thought of building the best possible solution for our service. The other teams, thought of doing the same. The communication among the teams was good, but not excellent. As a consequence, there was often the case of a team providing an unusable API or just the wrong service. Unfortunately, these issues do happen often and it is hard to avoid them (especially for products that want to use micro-services everywhere, because it is trendy and cool). Establishing good communication between the teams and writing well-written and consistent documentation helps a lot.

The Engineering Process

The engineering process is a huge topic. There are books and books on it. Here I will just list what helped me to move some projects forward.

SCRUM

If you do not know about SCRUM, I would suggest to read about it. It is, however, only one of many agile techniques to run a team. The most important thing, in my opinion, is not to follow the scrum process strictly. Instead, you need to understand the main principles of scrum, the requirements of your team and adapt to it. You should also be tolerant and opened to changes in the process. In our case, we were all working part-time and having daily meetings was just not feasible. We did however had sprint-planning meetings, a tasks board, use of version control and code reviews. Having just sprint-planning meetings, and very few daily meetings, was great. We focused more on doing, rather than on the talking. And whenever communication was needed, we used slack. Using a tasks board, like Trello, was also great. adding a task is just a few clicks away and even our manager, who is not a computer scientist, could do it. The backlog grew huge, but the tasks were focused and small. As a result, the productivity of the team had increased significantly.

What I really like about SCRUM is that the focus is always on small tasks. This is good for the morale of the team and it adds flexibility to adapt to new requirements. None likes reverting code. Especially if you have worked on that feature for weeks. However, if the task is small, reverting is not going to be that painful anymore.

Leadership vs Boss

I have seen a lot of posts, on Facebook or LinkedIn, about the differences among the two. Actually, there is some fundamental truth in those infographics.

A boss, uses people. A leader, develops them.
A boss, says "I". A leader, says "we".
A boss, takes credit. A leader, gives credit.
Etc.

Mark is the CTO and project manager of the startup he has founded. Being a small company, it is only four of them, including the CEO/CFO. Mark is a good programmer, but being the CTO he is always busy in meetings and telling Jon and Clare what to do. Both Jon and Clare are young programmers and have hard times understanding the existing code base and Mark’s vision. This leads to Jon and Clare producing poor code and an application that does not reflect Mark’s ideas. Mark has been acting as a boss and when he realises that Jon and Clare need help, he has serious difficulties understanding the underlying architecture (which has drastically changed since he looked at the code months earlier). He also finds out that Jon and Clare have not been doing any reviews, since there was none who would review their code.

The problem in this scenario is that Mark is the CTO and project manager of a small startup, but he still has to learn how to properly wear multiple hats and be a leader for the developers. That means, he has to manage them but also sit down with them and do some coding. When Mark decides to do this, Jon and Clare get motivated, perceive Mark’s vision better and can easily ask Mark about coding issues.

As a leader, you should be the first one to believe in your product and your team.

The technical issues

In the end, it is all about solving the technical issues within your software. Solving the people’s problem is the first step. This second step is equivalently important. This is not a computer science 101, however I will suggestions on what I believe to be good ways to improve the architecture, reliability and robustness of your product.

Automate your project

Use build automation tools (e.g. maven, ant, make, etc) when creating your project. If you do not use them, then you should. This was one of the first steps we took when the team grew. It was already hard to exchange the dependency files among two people. It became crazy, when we were 4-5. In our case, we used maven and this saved us a lot of pain and time. My recommendation is to always start from a maven parent pom, and build your project as a child. This pattern will allow you to add modularity over your project. For instance, you could decouple a part of a project and make it as a new child. This goes in favour of code reusability.

To give a simple (and ad-hoc) example, we would not have projects that would look like this anymore:

BUGA/
├── src
│   ├── webapp
│   ├── java
│   ├── test
│   ├── resources
├── resources
│   ├── upload.jar
│   ├── jane.jar
│   ├── obscure.jar
│   ├── third-parties
│       ├── another-jar.jar
│       ├── stop-it.jar

But we would rather have a more modular and structured project:

BUGA/
├── bugga-core
│   ├── pom.xml # child
│   ├── java
│   ├── test
├── bugga-rest-api
│   ├── pom.xml # child
│   ├── java
│   ├── test
├── bugga-webapp
│   ├── pom.xml # child
│   ├── java
│   ├── webapp
│   ├── test
├── pom.xml # parent

Automation can go a long way. You can add tests on your automation process and push your project to an automation server (e.g. Jenkins or TeamCity). You will love steps that look like these:

$ mvn clean compile
$ mvn test
$ add/commit/push your changes

and then see your project go green on the automation server. You could also add the production step at the end of the pipeline, so that your users always get the latest version of what you are building.

Learn the language

This can go wrong in many ways. The worst possible case, is that you end-up with something like this:

<%while (!days[currentCal.get(Calendar.DAY_OF_WEEK)].equals("Sunday")) {
                        currentCal.add(Calendar.DATE, -1);
                    }
                    for (int i = 0; i < 7; i++) { //for each day of the week (Monday to Sunday) %>
        <%="<tr align='center'>"%>
        <%="<td>"%><%currentCal.add(Calendar.DATE, 1);%><%=days[currentCal.get(Calendar.DAY_OF_WEEK)]%>, <%=currentCal.get(Calendar.DATE)%>-<%=months[currentCal.get(Calendar.MONTH)]%><%="</td>"%>
        <%for (int j = 0; j < serv.cnst.calculateTimeSlots() + 1; j++) { //for each timeslot
                if (!serv.isAfter(currentCal, i, j)) { //if the timeslot time is in the past, show disabled checkboxes (checked if booked, unchecked if not booked
                    if (!serv.checkTimeslotAvailability(i, j, diff, currentCal.get(Calendar.DATE), currentCal.get(Calendar.MONTH), currentCal.get(Calendar.YEAR))) \{/\%/>
        <%="<td>" + serv.getUserWhoBooked(i, j, diff, currentCal.get(Calendar.DATE), currentCal.get(Calendar.MONTH), currentCal.get(Calendar.YEAR)) + "<br><input disabled checked onchange='document.getElementById(\"slotID" + diff + "\").value = checkToText(this," + diff + ")' id='"%><%=i + "-"%><%=j + "-" + diff%><%="' type='checkbox'></td>"%>
        </\%\}/ else \{/\%/>
        <%="<td>" + serv.getUserWhoBooked(i, j, diff, currentCal.get(Calendar.DATE), currentCal.get(Calendar.MONTH), currentCal.get(Calendar.YEAR)) + "<br><input disabled onclick='document.getElementById(\"slotID" + diff + "\").value = checkToText(this," + diff + ")' id='"%><%=i + "-"%><%=j + "-" + diff%><%="' type='checkbox'></td>"%>
        </\%\}/
        
A LOT MORE CODE HERE

This was some legacy code we found in a project we agreed to maintain on behalf of someone else. The worst decision we could have ever made. The problem, there, was that the developer did not know what he was doing. Just bashing some html, js, java and jstl all together is not enough to make the code work. Code has to be maintainable too: easy to understand, refactorable, reusable.

Learning a language properly is really important. Often, we are thrown in a new project and have to learn a language quickly. This is when we make common mistakes and introduce bugs without knowing about it. Lets consider this case (java):

String updateString =
    "update " + dbName + ".COFFEES " +
    "set SALES = ? where COF_NAME = ?";

try {
    PreparedStatement updateSales = con.prepareStatement(updateString);

    for (Map.Entry<String, Integer> e : salesForWeek.entrySet()) {
        updateSales.setInt(1, e.getValue().intValue());
        updateSales.setString(2, e.getKey());
        updateSales.executeUpdate();
    }
} catch (SQLException e ) {
    e.printStackTrace();
}

The developer has used the PreparedStatement correctly, avoiding some potential SQL injections. However, he never closes the PreparedStatement and the Connection with the database. This is a common issue in java (when using streams) and all sort of languages and code where you need to manage the resources yourself.

So we have been taught to just add a finally to our try-and-catch and all is well.

} finally {
    if (updateSales != null) {
        updateSales.close();
    }

    if (con != null) {
        con.close();
    }
}

Well this might work, but you will always end-up in the case of forgetting to add that finally. If our developer, instead, had read a wee-bit more on Java 7, he would know that he could just use a try-and-catch with autocloseable.

try(Connection con = getConnection();
    PreparedStatement updateSales = con.prepareStatement(updateString)) {

    for (Map.Entry<String, Integer> e : salesForWeek.entrySet()) {
        updateSales.setInt(1, e.getValue().intValue());
        updateSales.setString(2, e.getKey());
        updateSales.executeUpdate();
    }
} catch (SQLException e ) {
    e.printStackTrace();
}

The more leaks you stop (or avoid), the more stable is your product. Who loves having a 500? None. And not your clients for sure. If you are a Java (or OOP) programmer, you will love reading “Clean Code” by Robert Martin or “Code Complete” by Steve McConnell. I always keep these book on my desk.

You should always try to have coding standards within your team. You can use or get inspired by the Google Style Guides [5], the Mozilla Coding Style [6], or the NASA JPL Coding Standards. Just do not wait for someone else to do it. This also goes back to the communication topic discussed above.

If you are new to the team, there might already be some used standards. Whether written or not, ask. If you do not want to annoy your colleagues every 5 minutes, then have a look at the existing code and keep it consistent.

Code reviews are also a great way to create standards within your code and architecture. Personally, I also found reviews to be the best teaching tool to learn how to program properly. Well, always make sure that the person reviewing your code has a good amount of experience.
Also, when you do a code review you learn the code written by others. This is a good way to share knowledge within the team.

Modularity

Modularity, as we have seen in the automation section, is good. You can achieve modularity as many levels. In the example above, modularity was introduced at the project level. This type of modularisation allows you to abstract code at a very high level. You could, for instance, extract a library out of your project and re-use it somewhere else.

Modularity, however, is useful at lower levels too. There are many reasons to write modular code: cleaner code, better architecture, less code (because of re-usability), separation of concerns, better consistency (because you have less code!), substitutability, and re-usability. General suggestions are to keep classes and methods small and focused. For instance, do not mix the business logic of your product with its lower level logic.

Always remember, however, that the application logic should be respected. What does it mean? It means that you shuld not sacrifice your architecture for the sake of having as less code as possible. Your architecture has to reflect the application logic.

Learn the tools

Do not let your tools slow down your productivity. Learn to use a terminal, its tools (e.g. cat, vim or emacs, less, ssh, cp, mv, etc). Learn to use an IDE, if you like using one. Learn to use a debugger. And learn to use any other tool that is necessary in the process (e.g. docker, tomcat, maven, git, etc.).

Understand and Use Version Control Systems

Version Control Systems (git, hg, perforce, bazaar, etc.) are not just fancy tools. They are necessary tools. If more than zero people work on the project, you should use it. Learn the tool used by your team, understand well how they use it too. I often see people using git or mercurial, but all they do is: add-commit-push, over and over again. It is important that you understand what a branch is, how you can rollback to a specific revision, what it means merging. I also highly recommend reading a bit about version control workflows [3].

Testing

Run tests. Always. This is the hardest principle to get into practice. The main issues with tests not being written is the lack of labour, especially when you urgently need to deliver as the deadline comes closer. If this is the case, all it takes is some guts and the ability to push back the deadlines. Give yourself enough time to add automated tests. Do a lot of manual testing too, not everything can be automated. If you cannot push back the deadline, then make your app simpler. Release less features in the upcoming update. It is always better to have a simpler and robust product, than a super-duper-advanced app that works well often, but not always.

Often, I hear people complaining that testing is just too hard and expensive. I have to setup my entire App just to test some small internal methods. Testing is stupid and useless!. When these types of issues arise, do remember to apply dependency injection. Why? You might ask. Well, the dependency injection pattern can be applied to legacy code, without the need to rewrite much of the application, and allows you to write tests in isolation via mocking (mock objects are simulated objects that mimic the behavior of real objects in controlled ways - Wikipedia).

Understand your App

Make sure that you understand what you are building. Whether you are the CTO, project manager, or the newest developer in the team: understand the architecture and the requirements of the project.

Remember the project that had code like this?

if (!serv.isAfter(currentCal, i, j)) { //if the timeslot time is in the past, show disabled checkboxes (checked if booked, unchecked if not booked
    if (!serv.checkTimeslotAvailability(i, j, diff, currentCal.get(Calendar.DATE), currentCal.get(Calendar.MONTH), currentCal.get(Calendar.YEAR))) \{/\%/>

This was a JSP application run over Tomcat. The developer had problems understanding the requirements and lacked some basic theoretical principles. After hours of looking at the code, we found out that when the admin added a new user, the UserManagement servlet cached the user and then pushed the INSERT statement into the database. On reading, however, the servlet would never make a SELECT query on the database, but look at the cache. The wrong one, however. Yes, because having two caches of the same type within the same servlet is a common pattern. NOT! This resulted in the admin never being able to see what users had been added. Consistency must be taken seriously. If you app does not server thousands of users, remove the cache (all of them) and just talk to the database.

Self-criticism

One of the most common problems among youngsters is to love their code. You should not. Always look at your code as a stranger and keep refactoring it, improving it. Obviously, have a good balance between refactoring and moving your features cards from the TODO list to the DONE list. If you are developing the front-end of a website, wonder what could be done better. Moving a button to the right? Use different colors? Use icons over text? Add a Tour feature of the app? Asking others helps. You might also add a feedback feature in your app and get feedback from your users. All feedback is good, give priority over tasks and discuss any issues with your team.

Keep It Simple, Not Necessarily Stupid

Last week I was reading this short story title “It is the Future” on CircleCI Blog [4]. Two developers discuss how a simple app should be architected and deployed:

Cool. I’m just building a simple web app at the moment - a normal CRUD app using Rails, going to deploy to Heroku. Is that still the way to go?

-Oh no. That’s old school. Heroku is dead - no-one uses it anymore. You need to use Docker now. It’s the future.

Oh, OK. What’s that?

-Docker is this new way of doing containerization. It’s like LXC, but it’s also a packaging format, a distribution platform, and tools to make distributed systems really easy.

Containeri.. — what now? What’s LXE?

-It’s LXC. It’s like chroot on steroids!

What’s cher-oot?

-OK, look. Docker. Containerization. It’s the future. It’s like virtualization but faster and cheaper.

Oh, so like Vagrant.

-No, Vagrant is dead. Everything is going to be containerized now, it’s the future.

More in the blog...

In the end, the best solutions was to develop an app using Ruby on Rails and deploy it on Heroku. That, however, does not mean that you should not know about docker, kubernetes, log unifiers, replication, distributed file systems, the CAP theorem, or Redis. It is, in fact, important to know about all these techniques and technologies and more. This will help in making the best choices when building your product. Are you building an app for a million users or for 200 of them? How much data do I need to store? How sensitive is the data? Ask yourself these questions, use your knowledge and make wise decisions. Not knowing about something is okay. If that is the case, ask someone else or read about it. The internet is really big. You can find guides and books about everything. “The Architecture of Open Source Applications” [7] is a good book to learn about complex projects.

References

[1] It takes one function - The Daily WTF - http://thedailywtf.com/articles/it-takes-one-function

[2] Bad software architecture is a people problem - Kate Matsudaira - Communications of the ACM, 09/2016, Vol.59 no. 09. DOI: 10.1145/2956641.2974011

[3] Comparing Workflows - Atlassian - https://www.atlassian.com/git/tutorials/comparing-workflows

[4] It is the future - CircleCI Blog - https://circleci.com/blog/its-the-future/

[5] Google Style Guides - https://github.com/google/styleguide

[6] Mozilla Coding Style - https://developer.mozilla.org/en-US/docs/Mozilla/Developer_guide/Coding_Style

[7] “The Architecture of Open Source Applications” - http://aosabook.org/en/index.html