They come to us at the last minute, and then despair that we don’t have any testers available and also wonder why they don’t have any automated testing set up!
They come to us at the last minute, and then wonder why the security review is going to delay their application’s deployment!
They scheduled their performance testing for the end, and then wonder why they have no time to fix the problems that were found!
They ignored the enterprise architect’s advice and designed their app with no regard for enterprise standards, and then wonder why people are upset!
They talk about continuous delivery, but we are the ones who have to operate their app, and they have not designed it to be easy to operate, and then they wonder why we are having trouble restarting instances!
These are common laments from “functional managers” in large organizations that are trying to go agile. Functional managers are those who are responsible for traditional development lifecycle “steps”, such as enterprise architecture, security, testing, quality assurance, release management, deployment, operations, and the other “silos” of a typical legacy IT organization. Agilists often blame these silos for the problems, but that is not helpful: agilists are brought in to streamline things, so they need to have answers about how to replace these functions with their agile equivalents.
Replacing functions is a huge step, however. A great deal of learning is required for an organization to go agile, and most organizations need intermediate steps. One of the most effective intermediate steps – one that fosters the required learning – is beefing up the release planning process to make it more inclusive. In other words, invite each of these silos: get them in a room and talk through the issues, so that nothing is left for the end of the release when there is no time left. Give people advance notice of what is coming, so that they have time to manage their own processes and their own workloads.
Busy functional managers will not always have time to attend each project’s release planning session. They will want to send someone on their behalf – one of the technical staff. That means that you need to give them advance notice of a release planning session and what the project is about, so that they can see who is available and send the right person. You also should tell them that it will be optimal if that person can be the one who continues to work with the team throughout development, as needed: that might affect who the functional manager sends, based on that function’s resource work schedule.
Not everyone needs to be in a release planning meeting from beginning to end. A release planning meeting can take from an hour to several weeks, depending on whether the project is new and how complex it is. If you expect it to take more than a few hours, it is best to plan which issues will be discussed when, and invite the right functional representatives for those time slots: they will be appreciative. However, there might be some topics for which you want everyone there. More on that in a moment.
Release Planning Topics

So what happens during this type of release planning? What things should you talk about?
The answer is conceptually simple: talk through all of the top level issues that affect what will be done, who will do what, how it will be done, and how things will work. I call these the conceptual pillars:
1. The core vision and high level requirements.
2. The team roles and collaborative processes.
3. The core design concept.
4. The development pipeline.
5. The testing strategies.
6. The external dependencies.
Most agile teams focus a great deal of attention on #1 – the release backlog. That is indeed the foundational element: it is what you will be building. So I am not going to say any more about that: everyone reading this knows all about it. There are books on how to do it, and it is part of core agile training. Teams also generally spend time on #2 – team roles – but not sufficiently, so I will discuss that below.
The rest of the pillars are things that teams often miss. These things are really important for continuous delivery (CD). Without nailing these, you will have a hard time getting to CD: you will find things not working well. Let’s take them one by one.
#2: Team roles and collaborative processes

Many teams come up with a list of team roles beyond the fairly common Scrum roles of Product Owner, Scrum Master, and Team Member. The additional roles accommodate the “silo” processes that are imposed by the organization. Remember, baby steps ;-) Such “extended team” roles might include test programmer, acceptance tester, QA analyst, security analyst, agile coach, project manager, tech lead, enterprise architect, release management liaison, data center engineer, and so on. The point of this discussion is not to get into what each of these roles might do and whether it is needed: it is assumed here that the organization currently requires these roles, and the discussion here is how to accommodate them so that everyone can do their job in the most agile way possible.
Agile teams make a lot of decisions in ad-hoc discussions. The problem is, when there are extended team roles such as those listed above, those roles are easily excluded from the ad-hoc discussions, yet these discussions often impact those functions. It works the other way too: the extended team roles often make choices that affect the team. All of this calls for more communication: the extended team and the immediate team (the development team) must collaborate on an ongoing basis, as needed. During release planning, it is important to discuss this issue with each of the extended team roles, and collectively decide on the best method for collaborating in an ongoing manner and keeping each other up to date. Simply asking everyone to join the standup might not be the best way: there might be too many people, and they also might not all be available at the standup time. Work out what makes sense. I am not even going to propose an approach here, because there are so many ways, and each functional area will have such different collaboration needs.
#3: The core design concept

The Scaled Agile Framework (SAFe) talks about “architecture runway”. Scott Ambler has long talked about “agile architecture” and “agile modeling”. Feature Driven Development (FDD) talks about the importance of modeling the business at the start of a project (great explanation here) to establish a shared conceptual understanding of key data and relationships. (Note: I was on the Singapore project that the article refers to.) Having early, face-to-face discussions about the primary models and design patterns is extremely important. This is not “big design up front” (BDUF), in which too many details are figured out too early. Instead, early discussions about models and design get everyone on the same page, using “broad brush strokes”. This greatly catalyzes future discussion and increases the “emotional intelligence” of the team as a whole. It echoes Peter Senge’s five disciplines: establishing and exchanging mental models, and working to reconcile them.
Up front high level modeling and analysis also informs decisions about what kinds of testing will probably be needed, what components and services will probably be needed, what the major interfaces of the system will probably be, and who needs to be involved when, because all of these choices vary with different technology stacks and different design requirements. And I emphasize the word “probably” in all this because up front decisions are always subject to change: they are merely a starting point.
There is no better time to establish the architecture runway than at the start of the project (or release), in an all team discussion, including those members of the extended team who might be affected. For example, some technologies are easier to deploy using automation than others, and a data center representative should be present for discussions about how the design might affect deployment – that is, if continuous delivery is important to you, and if the development team does not have direct access to the production deployment environment.
#4: The development pipeline

The “pipeline” is the end-to-end process from the Product Owner’s conceptualization of features through deployment, operation, and maintenance of an application. It extends even farther than that, but this scope is fine for this discussion.
Defining the pipeline consists of talking through and deciding what will happen when, why it is happening, how it will be done, who will do it, where it will occur, and what the acceptance criteria are. It extends to everything that is involved in the planning, creation, release, and operation of the software. Not all of these decisions need to be made up front, but it is crucial to get everyone together and agree on the basic outlines of the process, and identify decisions that still need to be made and who “has point” on those decisions. This is about designing the continuous integration and continuous delivery process as a whole, looking at it as a system.
When defining the pipeline, make sure that you also define how progress will be tracked for every aspect of the pipeline. The software development team uses agile work management tools such as a story board, which makes its progress very visible. There needs to be equivalent visibility for the work of the extended team, aggregated in a manner that the development team, as well as management, can see at a glance what the rate of progress is (“velocity”) and what is holding things up (“blockages” or “impediments”). It might be hard to aggregate this information because each silo area generally has its own work management tool, but this aggregation is perhaps something that the project manager can do, if there is one. Alternatively – and this is more agile – each extended team member can update a shared task board, possibly on a project wiki, that tracks the external tasks that the development team is depending on.
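As a sketch of what such a shared task board might aggregate – all task names, silo names, and statuses here are invented for illustration – each external task only needs an owner, a status, and any blockage, from which the at-a-glance progress and impediment views can be derived:

```python
from dataclasses import dataclass

@dataclass
class ExternalTask:
    """One task on the shared board that the development team depends on."""
    description: str
    silo: str            # e.g. "Security", "Release Management" (examples)
    status: str          # "todo", "in-progress", "done", or "blocked"
    blocker: str = ""    # why the task is stuck, if it is

def board_summary(tasks):
    """Aggregate the board into the view management wants at a glance:
    rate of progress, and what is holding things up."""
    done = sum(1 for t in tasks if t.status == "done")
    blocked = [(t.silo, t.description, t.blocker)
               for t in tasks if t.status == "blocked"]
    return {"progress": f"{done}/{len(tasks)} done", "impediments": blocked}

tasks = [
    ExternalTask("Provision test environment", "Data Center", "done"),
    ExternalTask("Approve firewall rules", "Security", "blocked",
                 "waiting on change-advisory-board review"),
]
print(board_summary(tasks))
```

The point is not the tooling – a wiki page or spreadsheet works equally well – but that external work is tracked with the same visibility the story board gives the development team.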
#5: Testing strategies

Agile teams know a lot about testing. Agile turns testing into programming: testing is replaced by test programming. This is not new – I did lots of this during the 1980s, and I am sure many others did it way before that – but agile emphasizes the importance of it as an enabler for continuous integration, whereby code is not checked in until it passes all the tests, and this is done frequently – often many times a day.
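To make “test programming” concrete, here is a minimal, hypothetical example using Python’s built-in unittest module; the function under test and its behavior are invented for illustration. In a continuous integration setup, code is not checked in until tests like these pass:

```python
import unittest

def apply_discount(price, percent):
    """Production code under test (a stand-in example function)."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return round(price * (1 - percent / 100), 2)

class DiscountTests(unittest.TestCase):
    """Tests written as code: run on every check-in, many times a day."""

    def test_typical_discount(self):
        self.assertEqual(apply_discount(80.00, 25), 60.00)

    def test_invalid_percent_rejected(self):
        with self.assertRaises(ValueError):
            apply_discount(80.00, 150)

# Run with: python -m unittest <this module>
```

The same style scales from unit tests up to automated acceptance tests; the difference is in what is being exercised, not in the practice itself.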
There is a gap here though. Most agile testing practices focus on functional testing. Continuous delivery requires that we expand that, to include every kind of testing: failure mode testing, scalability testing, security testing, and whatever other kind of testing is required for production deployment. Also, many agile teams forget to plan for test data. If you will need complex test data from a business area, invite that business area to the release planning session and pin down how the test data will be created and when it will start to be available.
The Definition Of Done – Revisited

Teams often define a “definition of done” (DOD) that lists the criteria that all stories must meet in order to be considered “done”. The DOD usually specifies that all functional tests have passed. This is not easily extended to other kinds of tests, because non-functional tests are often not story-specific, and some kinds of tests, e.g., full performance tests, are not tests that you want to run many times during an iteration. We therefore need to expand the DOD concept in some way.
One approach that I have seen work is to have the DOD apply only to tests that are story-specific. These are generally acceptance tests. There needs to be an automated suite of acceptance tests to make this feasible, and they must be organized by story, built around each story’s acceptance criteria. That is pretty common agile practice. Other tests that are not story-specific should not be covered by the DOD, but instead should be run regularly throughout an iteration: on each check-in for some, or nightly for others if they take a long time to run. Integration tests fall into this category, and they are often run when code is checked in. Again, it depends on the duration of, and resources required by, the tests.
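One way to picture this partitioning – the test names, story IDs, and runtimes below are all invented – is to tag each automated test with its story (or none), and let the DOD gate select only that story’s tests, while everything else runs on a schedule determined by how long it takes:

```python
# Hypothetical test inventory: (test name, story id or None, runtime in seconds)
TESTS = [
    ("test_password_reset_email_sent", "US-142", 2),
    ("test_reset_link_expires",        "US-142", 3),
    ("test_checkout_totals",           "US-151", 1),
    ("test_full_regression_suite",     None,     1800),
    ("test_peak_load_performance",     None,     3600),
]

def dod_gate(story_id):
    """Story-specific tests that must pass before the story is 'done'."""
    return [name for name, story, _ in TESTS if story == story_id]

def scheduled_suite(max_checkin_seconds=60):
    """Non-story tests: fast ones run on every check-in, slow ones nightly."""
    on_checkin = [n for n, s, t in TESTS if s is None and t <= max_checkin_seconds]
    nightly    = [n for n, s, t in TESTS if s is None and t >  max_checkin_seconds]
    return on_checkin, nightly

print(dod_gate("US-142"))
print(scheduled_suite())
```

Most test runners support this kind of tagging natively (e.g., marker or label mechanisms), so the partition can live in the test code itself rather than in a separate inventory.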
Some failure mode tests are story specific, and others are not. For example, stress tests are generally not story specific, and they should be run on a regular basis, but not necessarily every time code is checked in.
Security tests are especially problematic. You might even wonder: what are security tests? Many teams now add code scanning to their CI process. Code scanning is not enough, though, if you want to have secure code. To have secure code, you need to make an “assurance argument” for each feature that you design, and you need to verify through analysis that the assumptions of that assurance argument are met. (There is a good discussion here, but an “agile” approach to this should be less structured and more based on the programmer’s knowledge of secure design patterns. That is the primary topic of my book High-Assurance Design. For secure design patterns, see also the book Core Security Patterns by Steel et al., and visit owasp.org for Web app security techniques.) Note that this is an analytical process – not a testing process. However, it can be turned into a testing process by testing that the assumptions of each assurance argument remain true. Tools such as Fortify can be used for this purpose. Not many agile teams do this today, but as security becomes more important, it is essential that teams start to learn these techniques, because security scanning will never be sufficient: it catches only a small fraction of vulnerabilities. (See this article.)
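To illustrate what “testing that an assumption remains true” might look like – the routes, flags, and allowlist below are entirely invented – take one assumption from a hypothetical assurance argument, “every externally reachable endpoint requires authentication”, and turn it into an automated check instead of a statement in a document:

```python
# Invented route table: (path, requires_auth, externally_reachable)
ROUTES = [
    ("/health",     False, True),   # deliberately public
    ("/api/orders", True,  True),
    ("/api/admin",  True,  False),  # internal only
]

# Exceptions to the assumption must be listed explicitly, so that each
# one is a conscious, argued-for decision rather than an oversight.
PUBLIC_ALLOWLIST = {"/health"}

def violated_assumptions(routes):
    """Return every externally reachable route that neither requires
    authentication nor appears on the explicit allowlist: each entry is
    a violated assumption of the assurance argument."""
    return [path for path, auth, external in routes
            if external and not auth and path not in PUBLIC_ALLOWLIST]

# Run as part of CI: the assumption holds only if the list is empty.
assert violated_assumptions(ROUTES) == []
```

The check is deliberately simple; the analytical work is in writing the assurance argument and identifying which of its assumptions can be mechanically re-verified like this on every build.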
Test Coverage

To achieve continuous delivery, you have to have confidence that your automated test suite covers everything that presents a substantial risk. Notice that I said “substantial”. Life is full of tradeoffs: continuous delivery carries the great benefit that you can push changes to users as quickly as you want to, but it does not eliminate all risk. Some things will slip through your tests, so you need to make sure that the risk is acceptable – not zero.
The question is, how do you know how complete your tests are? For functional tests, we measure test coverage. Test coverage needs to measure how completely the requirements (functional and non-functional) are met. For continuous delivery, the non-functional tests become just as important as the functional ones. But how do you measure “coverage” for security? For performance? For reliability? For enterprise architecture compliance? For maintainability?
The first step is to actually define those requirements. The next is to add them as acceptance criteria, either at a story level or at a system level. The system level acceptance criteria apply to the tests that are not story-specific. Coverage for performance means that all performance requirements are met; the same applies to each of the other areas. Work with the extended team members to define what coverage should mean for each of their areas, and how it can best be measured or assessed, as automatically and repeatably as possible.
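One simple way to operationalize this – the requirement IDs and areas below are invented examples – is to treat coverage for a non-functional area as the fraction of that area’s stated requirements that have a repeatable automated check:

```python
# Invented requirement catalog: requirement id -> area
REQUIREMENTS = {
    "PERF-1": "performance",   # e.g. p95 page load under 2 seconds
    "PERF-2": "performance",   # e.g. 500 concurrent users sustained
    "SEC-1":  "security",      # e.g. all external endpoints authenticated
}

# Requirement ids that currently have a repeatable automated check
AUTOMATED_CHECKS = {"PERF-1", "SEC-1"}

def coverage_by_area(requirements, checks):
    """Per-area coverage: covered requirements / total requirements."""
    areas = {}
    for req_id, area in requirements.items():
        total, covered = areas.get(area, (0, 0))
        areas[area] = (total + 1, covered + (req_id in checks))
    return {area: f"{c}/{t}" for area, (t, c) in areas.items()}

print(coverage_by_area(REQUIREMENTS, AUTOMATED_CHECKS))
# Here: performance is 1/2 covered, security is 1/1 covered.
```

The value of the exercise is less in the arithmetic than in forcing each area to state its requirements explicitly, so that the uncovered ones become visible.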
#6: The external dependencies

Dependencies on events beyond the development team represent immense risks that are generally beyond the control of the team. I am not talking about dependencies on the work that needs to be done by the extended team, because we already discussed that. Here I am referring to things that are beyond the control of even the extended team: things like the arrival of equipment from a supplier, the availability of a test instance of a system from a third party, and so on. This is an area in which a project manager – if you have one – can really help: by persistently pursuing external dependencies.
Recap

Functional silos like these are not compatible with agile. Agile and continuous delivery work best if the development teams have full responsibility for, and control over, every step of the solution development and delivery pipeline, rather than relying on external parties to perform isolated steps independently. However, for it to be possible for the development teams to have full control, they need to learn the many skills and concerns that are represented by each silo area. QA exists for a reason. Release Management exists for a reason. That learning takes time.
Eventually the silos can be converted into training and coaching teams that teach development teams about security, regulatory compliance rules, test automation, deployment, and all the other things that these silos currently do. Some of the silo functions will go away entirely, or will transform: QA can shift from checking documents to performing actual assessment of risk coverage by the automated tests – including the non-functional tests. Security can shift from scrutinizing security plans to teaching teams how to conduct threat modeling and how to use tools such as Fortify more effectively. Testing can change from providing manual testers to teaching teams how to set up and use test automation tools. The silos change from being functions that “do” to functions that “teach”. That takes time though: all those functions need to change to make that possible. They must learn about using more automation, they must learn about teaching and coaching, and they must learn about how development teams work. Setting up a collaborative relationship between the teams and the silos at the start of each project is a crucial first step.