Interview with Madhur Kathuria

Madhur Kathuria has coached nearly 300 teams for almost 75 clients across the US, Europe, South East Asia, Malaysia and Thailand. In this interview he talks about some of the cultural challenges for agile adoption. Read it here.

Interview with Elena Yatzeck

Elena was Chief Agilist for JP Morgan Chase Treasury Services and is now a VP of Corporate Compliance Tech. Find out how JP Morgan Chase reconciles agile with compliance and risk management demands. Read it here.

Monday, February 2, 2015

Interview: Dean Leffingwell

Cliff: Today I am talking with Dean Leffingwell, creator of the Scaled Agile Framework, commonly known as SAFe. Dean, can you please tell me a little about your background?

Dean: My degrees are in aerospace and biomedical engineering, so I see myself as a systems engineer dedicated to software. In one form or another, I’ve spent my entire 40+ year career being responsible for software and complex systems development.

Cliff: I can see the imprint of that: SAFe seems to have a systems view of things.

With regard to your origins, you created RELA and Colorado Medtech.

Dean: In 1977, I founded RELA, and later absorbed the publicly held company, Cybermedic. The melding of those two organizations resulted in Colorado Medtech, also public. That was my first 20 years as a CEO, we built complex medical devices and other fun stuff that included one of the coolest adventure rides on the planet for a major theme park. I became a software quality methodologist by necessity because we were building systems that could literally save people’s lives, or if defective, could kill them. Since then, the focus on software quality has been a driving passion that informs everything I do. One of the reasons that I so enjoyed the transformation to Agile—especially after starting with eXtreme Programming—was the intense focus on the quality of code.

My exposure to Agile came through XP first and Scrum second, and I saw two things that I had not seen earlier: in XP a set of courageous software practices that were technically sound, and in Scrum, a simple and lightweight project management method. I thought that the two applied together were really cool, but I immediately noticed a clash between the user communities. Technically, I never really understood the basis for the competition, because to be effective you have to have both approaches. But methodologists don’t agree with other methodologists.

I’ve also been involved in Lean throughout my career. I was chairman of a company that made a lean version of MRP [Material Requirements Planning]. Colorado Medtech had a manufacturing capability, so I also cut my teeth on Lean manufacturing. I attended a workshop on the Theory of Constraints, learning from Goldratt himself. Lean helped save a division of our company that was critical to our ultimate success.

Fast-forward to the present day where we find ourselves building the world’s most complex systems with this incredibly robust body of knowledge that includes Lean, Product Development Flow – Don Reinertsen – Agile, XP, Scrum, and Kanban. SAFe integrates and builds on that pool of knowledge to help address growing systems complexity. And after all, what software developer today should not understand all of these aspects? That is one reason that you see some of the heat around SAFe; we don’t see it as a zero sum game. It’s all good. In fact, the next version of SAFe will incorporate kanban for teams, in addition to Scrum and XP, bringing optionality, balance and integration of the best of the best to the team’s choice of approach.

Cliff: As you pointed out, there seems to be quite a division in the industry. I don’t think that is healthy. Being from the same time period as you, I see all these things as different schools of thought that overlap in terms of ideas. These recent approaches do seem to complement each other. There is a long view that many people need to see.

Dean: Your website reflects the larger perspective, which is why I agreed to invest some time in this interview. I don’t usually engage in defending the framework because it speaks for itself through the website, scaledagileframework.com, and the case studies. You don’t need a methodologist’s or service provider’s opinion about it; everyone can decide for themselves. There is nothing hidden.

For example, about a year ago, a blogger was criticizing SAFe. That’s fine; there are plenty of improvements to be made. I criticize it myself. But he wrote that he could not figure out what he didn’t like about SAFe and then he realized that it was because there were no people in it. To me, that is like looking at the Eiffel Tower and seeing no iron. When you look at the Big Picture and click through to the articles behind it, you will see more people than anything else.

To simplify SAFe, I’ll share a discussion that happened just yesterday with one of the world’s largest software companies. They have some enormous challenges in a really large program—involving 400–500 developers and stakeholders from many aspects of the business—and are looking to SAFe for help. Absolutely mission critical. I said that the first thing we could do is organize, train, and empower Agile teams. Second, communicate the mission, provide some UX and architectural guidance for consistency of purpose and usage, and then let them define it, design it, plan it, build it, validate it, and gather customer feedback in a continuous series of two-week iterations. And finally, we facilitate largely face-to-face planning, feedback, JAD and problem solving sessions for the entire program, every ten weeks or so, on a fixed cadence. How could that not work?

“There are at least three hundred thousand Agile practitioners using SAFe today.”

When you look at it in such a simplified form, you wonder whether criticisms of SAFe are based on its fundamental constructs, or misconceptions, or perhaps the fact we are in a competitive marketplace for thoughts and services? SAFe is clearly a disruptive change to the industry. But SAFe is built on Agile teams. Period. There are at least three hundred thousand Agile practitioners using SAFe today, who were previously locked out. They were living in waterfall SDLCs. Just last week someone told me that there was a time when they could not use the word “Agile” in their company. Of course they use it now with SAFe. For many, their organization’s very survival now depends on it. We hear things like: “You’ve given us a wee bit of hope for this company,” and, “It used to be easier to write software that didn’t work than it was to change a requirement.” We changed that with SAFe.

That kind of personal feedback, along with the business results, is what motivates us every day. When we hear controversy about the method, we look at the measurable results companies are getting, see what can be improved, and move on to the next revision. And because SAFe is a work in process, we don’t get all emotional about it; rather, we learn and adapt. If I remember correctly, there’s an agile principle that speaks to continuous improvement, so I assume that applies to methods as well.

Cliff: One of the objections seems to be that SAFe is very prescriptive.

Dean: SAFe documents a set of proven success patterns. For instance, the SAFe Big Picture depicts teams iterating and delivering value every two weeks, and every so often—8, 10, 12 weeks—they check in with their end customers and larger stakeholders to validate the net accumulation of those iterations. Then they run a larger Inspect and Adapt workshop at the Program level to address the larger issues. Is that prescriptive? Sure, but can you imagine that you shouldn’t do that?

What’s more, XP and Scrum are very prescriptive as to how the teams do their work. Clearly, we need Agile guidance for people above the Team level, because it takes more than development teams to deliver end user value. Sometimes when you have a headache, maybe you need to take an aspirin. Might help with the pain.

Cliff: I think the push back is, why tell them that it has to be a certain number of weeks?

Dean: When someone interprets SAFe so literally it usually means they haven’t taken the time to click beyond the Big Picture to learn about the actual intention. For instance, if you click on “Develop on Cadence,” you’ll find the following: “ … while the Big Picture illustrates five sprints per program increment, that is arbitrary and programs can pick whatever cadence that best suits their abilities and context”. That’s guidance around a principle, not a prescription of how many aspirins to take.

And by analogy, can you fully understand a software system by simply looking at a sketch of the domain model? Obviously that’s not enough to reason about the underlying system. So it is with SAFe. The Big Picture is the domain model, but the principles and implementation live deeper.

Cliff: It looks prescriptive if you don’t click through and read it. But what you are saying is that there is a lot of judgment in these things.

Dean: Absolutely. It’s just a framework.

Cliff: Is SAFe hierarchical?

Dean: There has to be a top of a picture and a bottom. Where does strategy come from?

Cliff: Strategy is inherently hierarchical because it is the outermost level of intent, in terms of what an organization is trying to do.

Dean: Strategy and investment funding come from the top. The teams don’t pay their own salaries or decide what business the enterprise should be in. I think there is also a misconception about what is meant by the SAFe core value of alignment. Is “alignment” management telling the teams what to do and how to do it? Or is it guiding the mission for the program? What’s the alternative: no mission, misalignment?

Cliff: I have actually heard from some camps a rejection of alignment. The objection seems to be about bottom-up versus top-down, and about self-organization versus coherence.

Dean: Therein lies the crux of the issue. A key principle of product development flow is that overall alignment delivers more value than local optimization. To achieve that alignment, empower teams, and speed value delivery, SAFe fosters decentralized decision-making under an umbrella of common mission and some architectural governance.

For example, the group I spoke with last week has about 400–500 people working on a platform that processes billions of dollars of revenue and needs a major revitalization. They will be launching three or four Agile Release Trains with some 50 teams. Don’t you think there has to be some guidance that says, “Here is what we need to accomplish. These are the features that drive the most important behavior. These are the common UX patterns for the user navigating. Here is our view of an architecture that will hold it all together”? Do we believe that 500 people can independently and emergently arrive at a common conclusion? If you answer those questions truthfully, you’ll acknowledge that the vision for the new platform must be driven by the overarching business strategy. And there have to be people responsible for that. People who shoulder the ultimate responsibility for success of the enterprise. And yes, they tend to live at the top of the organizational chart.

Cliff: Is there a sweet spot for SAFe, or a range of organizations that it is a best fit for?

Dean: SAFe was designed in real world context at places like John Deere ISG, BMC, Navteq, Nokia Siemens Networks, etc.; places where there are 300–1,000 practitioners that need to collaborate on their work. Last week I was at the Scaling Agile for the Enterprise Consortium in Brussels. There were a couple of Agile thought leaders on stage who, when asked about scaling, basically said “Don’t do it. Don’t scale agile. Don’t get that big.” Seriously, can you imagine the response from the enterprise, “Sorry, it’s too late, we are already incredibly successful, and we are already big.”

If teams don’t need to collaborate on a common mission, that’s a different issue. For example, if there are even large numbers of teams building largely independent products, the level of governance in SAFe may not be necessary – though a common way of working may well be. But if you are building, say, a field crop combine with many hundreds of people involved, and there is a virtual rat’s nest of complexities and dependencies—the electronics, transducers, computer systems, actuators, the control system that gets fed from GPS to move the combine straight down the field, the engine control unit, real time vehicle service information and status reporting—well, you get the idea. That’s what SAFe is designed for.

And in the IT and ISV world, can a few agile teams build a significant enterprise class product these days? Should we compartmentalize the development of such solutions into isolated teams, or should we build a team of teams that synchronize and work toward a common mission?

“We sometimes work with applications so large that it takes multiple instances of SAFe to support them.”

To meet that enterprise demand, all of a sudden, there is a lot of energy going into various methods for scaling Scrum, and some are public about their belief that they are competing against a “big, one-size-fits-all framework.” That’s SAFe, of course, but I have news: we sometimes work with applications so large that it takes multiple instances of SAFe to support them. SAFe fits well for 8–10 release trains working together, but beyond that you’ll have a different level of problem of scale, and you’ll need multiple instances of SAFe. That’s one of the reasons we put Strategic Themes in 3.0, as a connector to other instances of SAFe, and to the enterprise’s overall business strategy.

That makes SAFe a highly scalable framework, but it is not designed to solve the problems faced by just a few teams looking to align their sprints. It is the larger enterprises that need SAFe, many of which are in the process of a SAFe transformation. If you name almost any ten Global 1000 companies, SAFe is already being deployed in a number of them.

As with any disruptive technology, there is an adjustment phase that will come more easily to some than others – think S-curve ‘early adopters,’ ‘early majority,’ etc. When SAFe is deployed for the first time, it can feel top-down to the Scrum coaches who are coaching the teams, because with SAFe their teams need to be aligned on a release train. Is it worth it? The results say so. And the teams quickly get into it as they realize they are empowered to contribute to the larger value. Working on a team of teams that is delivering enterprise value faster is simply more satisfying for all. SAFe is successful in the market for only one reason: it works. Check out the case studies pages [here] for the objective measures of the value of SAFe. Simply, winning is more fun.

Cliff: What got my attention when I first discovered it was the picture. It was the first picture that filled in the pieces of the puzzle.

Dean: Didn’t it make sense to you when you saw it? Although if we both looked at that very first picture now, I’d be a bit embarrassed. It looks like Fred Flintstone might have drawn it. Oh, I guess that was me. But SAFe evolves. Version 4 will be out this summer.

Cliff: It does make sense to me.

What are the reasons you would have multiple instances of SAFe? Is it because of different portfolios? Different sources of funding?

Dean: It is typically driven by the different value streams, business units, and operating budgets. In a really large business, say a 25B company, each business unit may have a few hundred million in revenue, and each business unit will invest a percent of that in IT and software development. But because of the organizational challenges of managing large numbers of practitioners, and because many are working in largely separate domains, they tend to naturally fall into pockets of 300-500-1,000 people, each with an instance of SAFe.

Cliff: Do you end up with a SAFe steering committee that oversees the multiple SAFe instances?

Dean: Well that’s up to the enterprise architecture and enterprise portfolio strategy, currently a bit outside the scope of SAFe. And in any case, they wouldn’t be steering SAFe instances so much as they would be defining and coordinating portfolio investments that business units use SAFe to realize.

The root cause of much of this debate comes down to a discussion about who decides what gets built. If 3-5 teams are working together in a domain they know, can they largely determine together what gets built? Probably. Would 100 development teams in a global healthcare company be the ones to decide if the company should enter a different market, and provision teams to address that opportunity? Of course not. Strategy and investment funding is a centralized concern.

Cliff: Teams are not always aware of the long discussions and analysis that take place before that point.

Dean: And perhaps we leaders cause some of that lack of visibility when we fail to have a systems, rather than a parochial or functional view – when we mandate waterfall SDLCs, when we fail to communicate a clear strategy and compelling mission, and worst of all, when we overload the teams with unrealistic and unachievable commitments. That’s why the SAFe model depends on Lean-Agile leaders, and the emphasis on taking a systems view, implementing flow, and empowering teams by constantly communicating vision and strategy.

Cliff: One hears a lot in the Agile community about culture, and being Agile versus doing Agile: What role does mindset or leadership style play in a SAFe implementation?

Dean: It plays a huge role. [Dean calls up the SAFe website and clicks on “Implementing”.] Look at steps 2 and 3 of a SAFe rollout. [They are, “Train All Executives, Managers, and Leaders,” and “Train Teams and Launch Agile Release Trains,” respectively.] Let’s start with Executives. In a SAFe rollout, the process is as follows: Over on the left here [he points to “Train Lean-Agile change agents”], everyone needs to understand the principles behind SAFe. This is what the process of building that mindset looks like: We train change agents (SPCs, both internal and external) to teach “Leading SAFe,” a course that introduces managers and executives to Lean and Agile thinking. Those leaders participate in a release planning simulation and do an exercise sprint, right in the classroom. They study the Agile Manifesto. They learn about Lean and Product Development Flow. Then they learn about Agile Teams, Agile Release Trains and how to implement an agile portfolio. The last two hours is a leadership module. We finish there, because if they are not ready to lead, rather than follow, success will be limited. Then the teams are trained in SAFe Scrum and XP, and organized around value streams that can more reliably deliver value on demand.

“Is there a new mindset required to be successful with SAFe? Yes.”

Is there a new mindset required to be successful with SAFe? Yes. Do we achieve it? Almost always. Are we dependent upon it? Absolutely. How else would companies get the results they are getting? But you can’t see mindsets in the SAFe diagram, you gotta click!

Cliff: The site is pretty rich, there is a lot of stuff. How does servant leadership fit into this?

Dean: We use “Lean-Agile Leadership” as our metaphor, which emphasizes taking a systems view, embracing the Agile Manifesto, product development flow, creating a learning organization, and enabling knowledge workers. Operating as servant leaders is part of that.

Cliff: There seems to be a misunderstanding in the industry about some core Agile concepts, such as what servant leadership is. Books I have read on servant leadership—and indeed ways in which I have experienced effective servant leadership—stipulate that servant leadership is not about totally leaving things up to the team: it is a style of leadership.

Dean: It is indeed a style of leadership. It is not passive; it’s supportive, but it still has responsibility for outcomes. Managers do not abrogate their responsibility, or indeed their authority, just because we better understand the power, and indeed humanity, of self-organization and empowerment. We must have both, leaders who lead, and teams and programs that are largely self-organizing and self-managing.

We are now dealing with hundreds of thousands of practitioners who have embraced a new method, indeed a better way, more empowering, more fulfilling, and potentially far more effective, a new belief system. But it is a huge danger to exclude or belittle management. Perhaps this is because they haven’t been managed well in the past; and yes, we still see plenty of that. Perhaps they assume managers cannot learn new behavior; and perhaps some cannot. But we also see the opposite, an emergence of a new form of leadership. One that is based on common principles. We see that every day too, and that inspires us to keep moving forward.

That includes new ways of planning work, which SAFe provides via large scale face-to-face planning. It’s absolutely key to what we do, and we take that to a level where some have said, “Well it’s not Agile to have 100 people planning together.” But face-to-face communication is a key tenet of Agile. For example, look at the group in the photo on our website [here]; I know that group. They get as many as 175 people “together” every ten weeks, in multiple locations, and they plan simultaneously. Every ten weeks they pull together folks from the US, India, and Serbia, they bring the business owners in to participate. See that table in the middle? Those are the business executives. Every ten weeks they spend a part of two days with the teams. Frankly, it’s exhilarating. There is nothing like it, and if you read Lyssa Adkins’ article in InfoQ [here], she attended one and noted that she had never seen anything like it. She called it an “agile accelerant.”

Cliff: That was quite an inspiring article. She talks about the arc of her thinking on it, and how it changed as she experienced SAFe.

With regard to the release planning meeting, how flexible is that? Is it always a two-day session?

Dean: Two days is very standard, but it depends a bit on scope. It isn’t just planning and alignment, it’s a joint requirements and design session. The group in the photo takes two and a half days, because they have a lot of folks in Mumbai with a 12.5-hour time delay.

Cliff: Does it ever take longer? When I have done release planning with teams, it can take a week.

Dean: It doesn’t take that long with SAFe. You are planning only the next Program Increment; you will do it again in about 10 weeks. Take a small bite. Limit the batch size. One big international company plans across five trains at the same time—and they still do it in 2.5 days or so, but they plan together because they are interdependent. This is one aspect of SAFe that is prescriptive: you plan together every PI. Face to face communication is part of the Manifesto. You aren’t SAFe without it.

Cliff: How much preparation is needed?

Dean: It depends on whether it’s your first time or tenth. Your first time can be more challenging. Alignment has to occur in management, development, system design, operations, etc. and it might not be present prior to the meeting. Some upfront preparation is going to be required.

For example, I was recently talking to a company that had been worried about their first big room planning experience. They were afraid that the first session would be chaotic because they were obviously not fully ready—you never really are—and people were wavering in their support. What would happen if they met for two days and nothing useful came of it? Well, the CIO was a Lean-Leader. He said, “This is going to be a really critical learning experience for us. It’s just us, so what can possibly go wrong, really? Let’s get going with that first meeting.”

Because of his leadership and what took place in release planning, they now have a common way of working, they have an aligned view amongst the executives and teams, and they eliminated much of the excess work in process. You cannot overestimate the value of finding and addressing program level bottlenecks, identifying otherwise hidden dependencies, finding the way to flow, and navigating competing priorities.

Cliff: How do you deal with the organizational structure issues? These organizations must have existing functions for QA and Testing and release management and so on.

Dean: The release train is usually virtual, at least initially. Just a group of the right people who agree to plan together, commit together, execute together and inspect and adapt. A business is not going to close down the business unit and merge with IT. The DevOps IT/OM group is still going to have a director who runs operations and deployment. But they have to operate as an extended team to accomplish the mission. [Dean points to this page, and “Finding the Value Stream.”]

Cliff: So it is kind of a matrix?

Dean: It is. It typically starts as a virtual organization. In some cases—the easier ones—they are already organized on lines of business.

Cliff: Is that a natural path?

Dean: In some cases. Let’s say an automotive components supplier has four BUs building four product lines. Most of the business people, devs, testers, architects, etc. are all in the BU. That’s a pretty straightforward value stream, and the virtual and the physical organization are basically the same.

But if someone is implementing single sign-on across a suite of products, or trying to improve the supply chain, you’ll have to bring people in from a number of areas and different platforms. And you can’t just create a new organization for the purpose of this large-scale initiative, even if it’s long lived. So release trains are often virtual.

Cliff: How does this compare to, say, Spotify?

Dean: I’ve talked to them a bit, and I’ve followed the method. We have even discussed SAFe for scaling further. If you look at their organizational metaphor, they have Squads, which are 5-10 individuals and a product owner, pretty much the same as SAFe agile teams. They have Tribes, organized groups of Squads that deliver largely independent solution value of a type. There are up to 100 or so people in a tribe—a fairly natural social limit, a direct parallel to a SAFe Agile Release Train. As I understand it, Guilds are basically communities of practice: they advance skill development, which we don’t model in SAFe. At Agile Israel 2014, Spotify’s Head of People Operations gave a talk about their self-organizing and self-managing Squads, and he showed some of what happened initially. Lots of fast success stories, for sure. Then he showed the initial UIs for Android, iOS, Windows, etc. They looked like they were built by different teams, because of course, they were. He noted that was suboptimal for the user experience, and then described how they went back and reworked those apps with a common UX governance, so the user experience was largely the same across devices.

We are all learning the same lessons by doing Agile at scale. Is great design emergent or intentional? Both! I have a great respect for those guys. I don’t see it as a competitive method; I see it as a different set of labels for accomplishing the same thing. We absolutely support communities of practice and what they are able to accomplish, but, for the time being, they are outside the scope of SAFe. Besides, if we added them, we’d be too prescriptive :)

Cliff: Have you ever had any organizations struggle in getting SAFe going, and if so, what are some of the suggestions that you have to avoid that?

Dean: We have a very simple mantra: if you train everyone and launch Agile Release Trains, you will succeed. If execs, business owners, architects and product managers think that Agile is just a process for developers, it’s not going to work. Everybody has to understand what they are doing and what everyone’s role is. We believe that the success of the initiative is ultimately dependent not on the framework, not on the consultants and coaches, and not even solely on the teams—it also depends on leadership. If leadership is trained in Lean-Agile thinking, people accomplish great things with SAFe.

Cliff: Is SAFe “Lean Systems Engineering” (LSE) already out, or is that coming out?

Dean: It’s well under way. A lot of the content has been developed. For example, the principles of Lean Systems Engineering are built in explicitly, whereas they are a little more hidden in SAFe. We have defined most of the core concepts: what’s a system, a system of systems, how to express systems intent without over-specifying, adaptive requirements and design, set-based development, MBSE, kanban for teams and systems work, etc. We met with about 20 systems engineers yesterday in a feedback session, and they showed us some things we need to change. It will be available to the public sometime soon, but we have not fixed the date, because we are not sure when we will reach an MVP. But I think we are over halfway there.

“SAFe is designed for large-scale software solutions—banking, financial, insurance, ISVs, etc.”

Cliff: Why would someone use SAFe versus SAFe LSE?

Dean: SAFe is designed for large-scale software solutions—banking, financial, insurance, ISVs, etc. But if you are building a satellite, where you have the satellite itself, the ground station, the web farm feeding data to the users, then that is really a system of systems, and you have to understand how the subsystems are built and interact. How one system may impose requirements on another, and how capabilities span subsystems. It’s a different problem: the systems and subsystems and their interfaces are physical and tangible, and the notion of value streams is not necessarily the right abstraction at the highest level.

The large systems builders—industrial, defense, automotive, home automation, and such—come to class to learn about SAFe and how best to apply it to their context. But they also note that “We don’t really have a Portfolio level concern here, it’s just one really big system.” We are learning from them how to model things differently. Systems, subsystems, components, capabilities, and features all play a role.

Cliff: And you have hardware-in-the-loop testing.

Dean: You still design with fast iterations and integrations. You build in small batch sizes. But you also probably have IV&V teams who may be the only ones that can put the whole thing together and test it. You have supplier subsystems and internal programs that may or may not be using Agile. You might have a customer that says “here is the system and software requirements specification, do it like it says.” You have delivery milestones for certain. You have a whole different set of constraints. It is not as free form as it is in SAFe, but you still want the benefits of a Lean and Agile approach. That’s the challenge of the modern systems builder and we think we can help with SAFe LSE.

Cliff: Is there anything that you would like to mention about what people can look forward to?

Dean: You can look forward to the continuing evolution of SAFe. The next release, SAFe 4.0, will be out this summer. It will include a number of new constructs and content elements. For instance, it will integrate kanban guidance for Teams, alongside Scrum, so SAFe teams have a clearer choice of methods, and can even combine them as they see fit.

And by the way, we are not so sure how well that news will be received in the Scrum and Kanban communities, but we think the teams deserve that choice. Let’s say you are adopting SAFe at a major systems builder, and you have a small group of 3-5 optics engineers, do they need a Scrum Master and Product Owner? Not clear. Do they need to visualize work and understand flow? Do they need to integrate with the rest of the system every two weeks? Absolutely.

But personally, I look forward to SAFe Version 8.0! That should be awesome. Some of the people in my SPC class asked me why I don’t just call version 4 version 8, but we all know that would be cheating. And of course, we have SAFe LSE 1.0 that will launch sometime this year, along with companion courseware.

Are there going to be multiple ways to scale, be it Scrum and others? Of course. Are we learning new and better ways to deliver bigger and better systems more quickly? I sure hope so. As of now, we are on Version 3.0 of SAFe, and it works. It has a large footprint of customers experiencing success, and a global community of consultants, partners, and practitioners implementing it, supporting it, and telling us how to improve it. We’ll keep listening and evolving. That’s a pretty good launching point for a next set of innovations.

Saturday, January 10, 2015

Cloud based apps are extremely vulnerable - here's what to do

And the Two Design Patterns That All Developers Should Know

15 per cent of business cloud users have been hacked.

That is according to a recent Netskope report (article here).

Recent debates about whether cloud storage is secure have focused on the infrastructure of the cloud: that is, if your data is in the cloud, can other cloud users see it? Can the cloud provider see it?

But there has been little attention to the even more important issue of whether cloud apps themselves are secure. This is so important because if your data is in the cloud, then it is not behind your company’s firewall – it is accessible over the Internet. All one needs is the password. So if you think things were bad before, just wait until hackers shift their focus to the cloud.

Executives think that IT staff know how to write secure software applications – but most don’t.

Companies spend huge amounts of money trying to make their infrastructure secure, but they invest essentially nothing in making sure that their application code itself is secure. As I wrote in my book High-Assurance Design,

•    The average programmer is woefully untrained in basic principles related to reliability and security.

•    The tools available to programmers are woefully inadequate to expect that the average programmer can produce reliable and secure applications.

•    Organizations that procure applications are woefully unaware of this state of affairs, and take far too much for granted with regard to security and reliability.

The last bullet is the most important one: Executives of companies think that IT staff know how to write secure software applications – that to do otherwise would be unethical, and their staff are definitely not unethical. But this attitude is the heart of the problem, because the fact is, most software developers – and even most senior software architects – know very little about how to write secure software. Security just isn’t that interesting to most programmers: no one rewards you for writing secure code – not like you get rewarded for writing more features. And no one is asking for it, because there is an assumption – an incorrect one – that programmers create secure code in the course of their work, just as plumbers create well sealed pipes in the course of plumbing. True for plumbers, in general, but not true for programmers.

Recently I was on a DevOps team in which the client was very concerned about security. The client ran its own scans of our servers in the cloud, and found many issues that needed to be fixed. All of these issues were infrastructure related: primarily OS hardening. None had to do with the design of the application. The general feeling of the team was that the security of the application itself would not be questioned, so we did not have to worry about it. At one point, one of our databases in our cloud test environment was hacked. The database was shut down and a forensic analysis was supposedly performed (we were never told what they found). There was no impact on the team’s work – it was business as usual.

If we don’t fix this dysfunction in our industry, then the Internet Of Things (IOT) will be a disaster.

This state of affairs is unsustainable. If we don’t fix this deep rooted dysfunction in our industry, then the Internet Of Things (IOT) will be a disaster: Imagine having every device you own connected to the Internet – to a cloud service of some kind – and all of these devices and accounts hackable. And imagine the continuous software updates to keep pace with newly discovered security vulnerabilities. This is not a future that I want – do you? Not only is George Jetson’s car a pain to maintain – with constant software updates – but it might come crashing down. People will be afraid to drive their car or use these IOT devices.

The only way to fix this is for organizations to demand that developers learn how to write secure software. You cannot scan for application level security: doing so is not effective. Having a “security officer” oversee things is not effective either – not unless that person intends to inspect every line of code written by every programmer – and that is not feasible in an Agile setting, where the code changes every day. The only way to produce secure software in an Agile environment is for the programmers to know how to do it.

It is not that resources for this are lacking: there are plenty, my own textbook included. There are tons of books, there are online resources – notably OWASP – and there are even certifications – and these certifications are the real deal: these are not fluff courses.

People like magic bullets. Unfortunately, there is no magic bullet for security: knowledge is the only path. But if I were asked what two things software developers should know to make their code more secure, I would have to say that they should know about these two design patterns: (1) Compartmentalization, and (2) Privilege Separation.

Your systems will be hacked. The only question is, What will the hackers get away with?

Your systems will be hacked. There is no question about that. The only question is, What will the hackers get away with? Will they be discovered right away through intrusion detection monitoring and shut down? And if not, will they be able to retrieve an entire database of information – all of your customers’ personal data? That is, will one compromised account enable them to pull down a complete set of information?

Compartmentalization is an old concept: In the context of computers, it was first formalized by the Bell-LaPadula model for security. It became the basis for security in early military computer systems, and it formalizes the essential concept used by the military and intelligence communities for protecting sensitive information. It is based on the concept that a person requesting access to information should have (A) a sufficient trust level – i.e., they have been vetted with a defined level of thoroughness – and (B) a need to know: that is, they have a legitimate reason for accessing the information. No one – not even the most senior and trusted person – can automatically have access to everything: they must have a need to know. Thus, if someone needs information, you don’t open the whole filing cabinet: you open only those files that they have an immediate need for. To open others, they have to request permission for those.
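To make the two conditions concrete, here is a minimal sketch in Python. It is only an illustration, not a real access control system and not a full Bell-LaPadula implementation: the names `Level`, `Document`, `User`, and `may_read`, and the sample data, are all hypothetical. The point it shows is that a high trust level alone never grants access; the need to know must also be present.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class Level(IntEnum):
    """Hypothetical trust levels, ordered from least to most sensitive."""
    PUBLIC = 0
    CONFIDENTIAL = 1
    SECRET = 2


@dataclass(frozen=True)
class Document:
    name: str
    level: Level        # how sensitive the document is
    compartment: str    # the subject area it belongs to


@dataclass
class User:
    name: str
    clearance: Level                                      # (A) vetted trust level
    need_to_know: set[str] = field(default_factory=set)   # (B) granted compartments


def may_read(user: User, doc: Document) -> bool:
    """Grant access only if BOTH conditions hold."""
    has_trust = user.clearance >= doc.level
    has_need = doc.compartment in user.need_to_know
    return has_trust and has_need


# Even the most senior user is refused outside their compartments.
general = User("general", Level.SECRET, {"logistics"})
convoy = Document("convoy-routes", Level.SECRET, "logistics")
payroll = Document("payroll-2015", Level.CONFIDENTIAL, "finance")

print(may_read(general, convoy))   # True
print(may_read(general, payroll))  # False
```

In a real system the `need_to_know` set would be granted per task and expire, rather than being a permanent attribute of the user.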

Military computing systems are onerous to use because of the layers of security, but in a civilian setting for business applications there are ways to adopt the basic model but make parts of the process automatic. For example, restrict the amount of information that an individual can access in one request: don’t allow someone to download an entire database – regardless of what level of access they have. And if they start issuing a lot of requests – more than you would expect based on their job function – then trigger an alarm. Note that to implement this type of policy, you have to design the application accordingly: this type of security is not something that you can bolt on, because it requires designing the user’s application in such a way that they only access what they need for each transaction and are not given access to everything “in that file cabinet”.
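The two automatic controls just described, a cap on how much one request can return and an alarm on unusual request volume, can be sketched roughly as follows. This is an assumption-laden illustration in Python: the thresholds, the `AccessMonitor` class, and the `fetch_rows` function are invented for the example, the "table" is just a list, and the alarm is a list standing in for a real alerting channel.

```python
import time
from collections import defaultdict, deque

MAX_ROWS_PER_REQUEST = 50      # assumed cap; no caller may exceed it
MAX_REQUESTS_PER_WINDOW = 20   # assumed "normal" volume for a job function
WINDOW_SECONDS = 60.0


class AccessMonitor:
    """Tracks request volume per user over a sliding time window."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._recent = defaultdict(deque)  # user -> timestamps of recent requests
        self.alarms = []                   # stand-in for a real alerting channel

    def record(self, user: str) -> None:
        now = self._clock()
        stamps = self._recent[user]
        stamps.append(now)
        # Drop timestamps that have aged out of the sliding window.
        while stamps and now - stamps[0] > WINDOW_SECONDS:
            stamps.popleft()
        if len(stamps) > MAX_REQUESTS_PER_WINDOW:
            self.alarms.append((user, len(stamps)))


def fetch_rows(table, user, offset, limit, monitor):
    """Return at most MAX_ROWS_PER_REQUEST rows, whatever the caller asked for."""
    monitor.record(user)
    start = max(0, offset)
    capped = max(0, min(limit, MAX_ROWS_PER_REQUEST))
    return table[start:start + capped]


monitor = AccessMonitor()
table = [{"id": i} for i in range(1000)]
page = fetch_rows(table, "alice", offset=0, limit=1000, monitor=monitor)
print(len(page))  # 50
```

Note that the cap applies regardless of who is asking, and that pulling the whole table would take many calls, which is exactly the pattern the monitor flags.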

The other key concept that programmers need to know is “privilege separation”. No one should be able to access a large set – e.g., a table – of sensitive data directly: instead, they should have to access a software service that does it for them. For example, if a user needs to examine a table to find out which rows of the table meet a set of criteria, the user should not be able to access or peruse the table directly: the user should only be able to initiate the filter action and receive the single result. The filter action is a software service that performs the required action under the privileged account of the server – which the user does not have access to. The user performs his or her work using an account that is only able to initiate the software service. If the user’s account is obtained through a phishing attack, that account cannot be used to obtain the raw data in the database: retrieving the entire table would require a huge number of calls to the service and intrusion monitoring should be watching for abnormal use such as that. This does not prevent hacking, but it greatly limits what can be lost when a hack occurs.
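As a rough sketch of that arrangement, consider the Python below. Everything here is hypothetical: the `CustomerLookupService` class, the `customers` table, and the sample rows are invented, and an in-memory SQLite database stands in for the real data store. A single Python process cannot actually enforce the separation; in a real deployment the service would run under its own server account and the user's account would hold no database credentials at all.

```python
import sqlite3


class CustomerLookupService:
    """Stands in for a service running under the server's privileged account."""

    def __init__(self):
        # Only the service holds the database connection.
        self._db = sqlite3.connect(":memory:")
        self._db.execute(
            "CREATE TABLE customers (id INTEGER PRIMARY KEY, email TEXT, balance REAL)"
        )
        self._db.executemany(
            "INSERT INTO customers (email, balance) VALUES (?, ?)",
            [("a@example.com", 10.0), ("b@example.com", 250.0), ("c@example.com", 990.0)],
        )

    def find_by_email(self, email: str):
        """The only operation exposed: one filter in, one result (or None) out."""
        row = self._db.execute(
            "SELECT id, balance FROM customers WHERE email = ? LIMIT 1", (email,)
        ).fetchone()
        return None if row is None else {"id": row[0], "balance": row[1]}


# The user's account can only initiate the service call.
service = CustomerLookupService()
print(service.find_by_email("b@example.com"))    # {'id': 2, 'balance': 250.0}
print(service.find_by_email("nobody@example.com"))  # None
```

The design choice worth noticing is what is absent: there is no method that returns the table, accepts arbitrary SQL, or pages through all rows, so a stolen user account yields one record per call.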

These measures are not sufficient, but they are a start, and they provide a foundation for how to think about application level security, from which programmers can learn more. The key is to start with an access model based on the kinds of actions that users need to perform and the subsets of data that they need direct access to for each transaction – access is not simply based on their overall level of trust or general need to access an entire class of data.

Organizations are completely to blame for the current state of affairs – and organizations can fix it.

Organizations are completely to blame for the current state of affairs: If organizations demand that programmers know how to write secure code, then programmers will respond. People are merely focusing on what their bosses are telling them is important.

So if you are an executive in an IT organization, it is up to you. The industry will not fix things: You need to make security a priority. You need to tell your teams that you expect them to learn how to write secure code. You need to create incentives for programmers and software architects to become knowledgeable and even certified in secure coding. You need to create a culture that values security. Security is up to you.

Saturday, January 3, 2015

Real Agile Testing, In Large Organizations – Part 4

(Continued from Part 3)

Is everyone a tester?

This is one of the greatest debates about Agile testing: who does it? One camp claims that everyone on an Agile team is a tester: there should be no “tester” role. The Scrum camp is perhaps most adamant about this. Another camp claims that there are testers, and that the separate role is very important.

Again, the right answer is: it depends. Even Jeff Sutherland – the inventor of Scrum – has complimented the performance of projects that had separate test teams. This one stands out. So if it is ok with Dr. Sutherland, it should be ok for Scrum adherents. The question is, when does it make sense, and when does it not make sense?

For Morticia’s website (see Part 1), Thing did most of the testing but we all chipped in, and that was fine. But for the EFT Management Portal, the testing is so complex that it would sure make sense to have a test lead, and several people to focus on pulling together the performance testing, the security testing strategy, the testing of the multiple back end partner interfaces, the testing of the legal compliance rules, and so on. These things are like projects in their own right and so they need leads. But don’t forget about learning: some people might want to learn about new types of testing and test automation, even if they have not done it before, so allow team members to change roles with appropriate supervision (e.g., through pairing).

Saying “You should never have a test lead” is not very Agile, and saying “You should always have a test lead” is not very Agile either.

If unsure, use common sense: don’t use doctrine. Agile is first and foremost about applying judgment: that is why the Agile Manifesto is written the way it is, with phrases like “While there is value in the items on the right, we value the items on the left more.” In other words, it is not prescriptive. They wanted us to keep our thinking flexible: e.g., saying “You should never have a test lead” is not very Agile, and saying “You should always have a test lead” is not very Agile either.

Who needs to understand these things

One of the most important aspects of Agile in general is the conversations among the team – the exchange of ideas, the helping each other, and the talking through of issues. This is critically important as it relates to testing and understanding when we have met the needs of the end users. Ensuring that the team understands – not just reads – the testing strategy is critical, so that the testing strategies are ever present in the minds of the developers. Developers need to think about testability as they design and code, and they need to be thinking about failure modes and how those are going to be tested, because developers often identify situations that testers have overlooked. Communicating testing concerns across team roles is extremely important.

In Part 1 of this article we pointed out that an Agile test strategy is developed and maintained by the team, in collaboration with external stakeholders. The team leader(s) (e.g., Scrum Master, coach, project manager, tech lead, etc. – however the team(s) is/are constituted) need(s) to understand what an Agile test strategy is for, so that it is accounted for during iteration planning. If the organization has support functions such as Security, Testing, Architecture, etc., those support functions need to understand how an Agile test strategy is different from a traditional test plan: the collaborative nature of Agile testing, the need to test continually, the need to automate as much as possible, and the need to evolve the testing strategies as the team learns more about the application and its requirements. The support function managers need to know all this so that they can make sure that they provide staff to collaborate with the team in the initial development of the testing strategies and throughout software development.

The support functions will need to come to terms with the fact that their role changes significantly with respect to waterfall development: waterfall teams obtain services from support functions, services such as Testing, Architecture, etc., and those services operate largely independently. In an Agile setting, the support functions need to operate in a collaborative way, working side by side with the team. In fact, much of their work should shift from “doing” to “teaching” – i.e., the support functions need to coach the team in how to perform the things that the support function used to do – to the extent that that is practical. Thus, support functions become coaching centers and resource centers. (In the article How To Rapidly Infuse Technical Practices Into Your Agile Teams we talk about how to transition waterfall oriented support functions to Agile support functions.)

Agile coaches need to work with the various support functions to help them to think through these changes. Agile will impact the types of people who work in the various support functions – they need to be more people-oriented, with an interest in helping others instead of doing the work themselves. The support function staff will also have to learn about Agile practices and automation tools. Agile will impact how the support functions are measured by senior management: they will need to be measured on how effectively they help teams to become self sufficient in technical practices, and the support functions also need to be measured in terms of whether they stay current in the rapidly evolving landscape of Agile tools. Given these changes, Agile will therefore impact funding for these functions. It will shift the balance of power in the organization, and that is why the CIO needs to be the driver for these discussions. In a successful Agile transformation, the support functions are not eliminated: they are transformed and reorganized. Knowledge must increase – not decrease – and to make continual learning a sustainable practice, it really helps to have organizational functions that focus on helping practitioners – the teams – to continue to learn new things in an endless cycle of learning, doing, and improving.

Creating a learning organization

In a discussion thread in the LinkedIn group Agile and Lean Software Development, Claes Jonsson, a Continuous Deployment architect at TPG Objektfabriken in Sweden, asked this pertinent question:

How is [assurance achieved] in an organization that is committed to delivering the right thing, with extremely high quality, minimal waste and with the shortest possible time to market using Lean Startup principles and Continuous Release practices? And do note that this does NOT mean unstructured, or disorganized, it instead relies on high organizational alignment, and extreme discipline.

The only way for an organization to preserve assurance – that is, manage risk – while becoming more Agile is to elevate people's knowledge. E.g., consider security: the process of having "gates" for security review is antithetical to Agile because gates impose risk management at discrete points instead of integrating it into the development process itself. But if you teach developers how to write secure code, then you don't need the gates anymore! The same thing applies to other areas of assurance.

But wait: I am using a little bit of hyperbole here: gates are a form of oversight, and it is not really that you don’t need any kind of oversight – you do – but it takes a different form. In Part 2 of this article we talked about the concept of test coverage, and the role that a quality assurance (QA) function might play in an Agile setting. We explained that an Agile form of QA is still independent, but that it works alongside a development team – not as a gated phase. Again, Morticia’s website probably doesn’t need such a setup, but the EFT management portal probably does: you need to make a judgment about how much independent quality oversight is necessary to properly manage all of the risks in your project or program. Agile is about transparently and collaboratively making those kinds of judgments – not blindly following a plan or procedure.

Many large organizations utilize gated software development processes for historical reasons. These “gates” typically include a review phase for things such as security and regulatory compliance. What we find when working with these organizations – and this is different from small companies and startups – is that the gates are used in lieu of conversations about how the software will meet compliance requirements: i.e., the transparent and collaborative discussions about what process to use for the project do not take place. By shifting to a culture of learning through conversations, gated processes can eventually be reduced to a few minimal stages or eliminated entirely.

There is an artificial comfort in gated processes.

There is an artificial comfort in gated processes: one feels secure because the gates are in place, but the comfort is naïve because the gates do not address the underlying reason the gates were created in the first place: that those who are building systems either do not know how to implement the compliance and risk management requirements, or they are not testing sufficiently for these things. Learning organizations move past this dilemma by ensuring that there is a much broader understanding of the requirements and how to test for them.

Agile transformation is really about systematically creating a learning organization. You have to identify the things that people need to know, and make sure that there are people who know those things embedded in the development process, by creating a system for people to learn and share that knowledge (here is one approach). Ideally, everyone knows everything, but that is not practical, so there is a balance that needs to be achieved between specialists and generalists. But all need to be involved in real time or near real time.

The chart at the end of this article lists some of the things that each part of the IT organization will need to learn. As you can see, it is a lot, and that is why learning – not process re-engineering – is the “long pole in the tent” for Agile transformation. Learning is one of the first steps on the long road of changing to an Agile culture.

Conclusions

Morticia’s website and the EFT Management Portal are two extremes – as business systems go. Most business applications are somewhere in between. That is why there is no single answer to how one should plan and execute testing under Agile.

Trust the team – that works fine for Morticia’s website. Top down planning – quite a bit of that is needed for the EFT Management Portal, although we try to approach it in an Agile way by putting the team in the driver’s seat, by keeping the documentation light, by allowing things to evolve, to a degree, by doing testing repeatedly and with as much automation as possible, and by implementing a learning strategy built around coaching to help teams to learn what they need to know to address all of the things that need to be tested.

In the case of the EFT Management Portal we also found that external parties demand – have a right to – oversight, and their risk management team will want to talk to our development leaders – hence we need to have development team leaders if only for the purpose of interfacing to these risk management folks – and the risk management folks will want to review our testing strategies and the ways that we measure test coverage. They will also be watching closely when our first release is deployed for demonstration: by that time, we should have already tested the application at scale in our cloud based test environment and so we should know what the outcome will be – there should be no surprises – but the first release is still a major visible milestone, even if it is not a production release, and people’s credibility rides on it to a large degree.

The fact that credibility rides on a first release is cultural – but it is also human, and to a large extent independent of culture – and so even though Agile encourages tolerance for failure, that has its limits: the idea is to fail early to learn so that you do not fail when it counts and everyone is watching – including the risk management people who are tasked with finding out if you know what you are doing. Testing is Agile’s primary tool for ensuring early failure (feedback), so it is crucial to do the early planning needed to make sure that testing is thorough.

One sign that your organization is embracing early contained failure as a strategy for ensuring long term success is when teams start using demonstrations as a way of “proving” that compliance requirements are met. Teams often look forward to being able to do this. This then continues to strengthen trust within the organization. The more an organization implicitly trusts its teams, the less process rigor is needed – but this will only occur if continual learning is implemented as a strategic way of ensuring that the teams have and continue to have the knowledge that they need.

Authors (alphabetically):
Scott Barnes
Cliff Berg

As PDF.