Testing Blog
This Code is CRAP
Tuesday, February 22, 2011
Note: This post is rated PG-13 for use of a mild expletive. If you are likely to be offended by the repeated use of a word commonly heard in elementary school playgrounds, please don’t read any further.
CRAP is short for Change Risk Anti-Patterns – a mildly offensive acronym to protect you from deeply offensive code. CRAP was originally developed and launched in 2007 by yours truly (Alberto Savoia) and my colleague and partner in crime Bob Evans.
Why call it CRAP? When a developer or tester has to work with someone else’s (bad) code, they rarely comment on it by saying things like: “The median cyclomatic complexity is unacceptable,” or “The efferent coupling values are too high.” Instead of citing a particular objective metric, they summarize their subjective evaluation and say things like: “This code is crap!” At least those are the words the more polite developers use; I’ve heard and read far more colorful adjectives and descriptions over the years. So Bob and I decided to coin an acronym that, in addition to being memorable (even if for the wrong reasons), matches the language its intended users actually use and is guaranteed to grab a developer’s attention: “Hey, your code is CRAP!”
But what makes a particular piece of code CRAP? There is, of course, no fool-proof, 100% objective, and accurate way to determine CRAPpiness. However, our experience and intuition – backed by a bit of research and a lot of empirical evidence – suggested the possibility that there are detectable and measurable patterns that indicate the possible presence of CRAPpy code. That was enough to get us going with the first anti-pattern (which I’ll describe shortly.)
Since its inception, the original version of CRAP has gained quite a following; it has been ported to various languages and platforms (e.g. Java, .NET, Ruby, PHP, Maven, Ant) and it’s showing up both in free and commercial code analysis tools such as Hudson’s Cobertura and Atlassian’s Clover. Do a Google search for “CRAP code metric” and you’ll see quite a bit of activity. All of which is making Bob and me feel mighty proud, but we haven’t been resting on our laurels. Well, actually we have done precisely that. After our initial work (which included the Crap4J Eclipse plug-in and the now mostly abandoned crap4j.org website) we both went to work for Google and got busy with other projects. However, the success and adoption of CRAP is a good indication that we were on to something and I believe it’s time to invest a bit more in it and move it forward.
Over the next few weeks I will post about the past, present and future of CRAP. By the time I’m done, you will have the tools to:
- Know your CRAP
- Cut the CRAP, and
- Don’t take CRAP from nobody!
I’ll finish today’s entry with a bit of background on the original CRAP metric.
A Brief History of CRAP
As the CRAP acronym suggests, there are several possible patterns that make a piece of code CRAPpy, but we had to start somewhere. Here is the first version of the (in)famous formula to help detect CRAPpy Java methods. Let’s call it CRAP1, to make clear that this covers just one of the many interesting anti-patterns and that there are more to come.
CRAP1(m) = comp(m)^2 * (1 - cov(m)/100)^3 + comp(m)
Where CRAP1(m) is the CRAP1 score for a method m, comp(m) is the cyclomatic complexity of m, and cov(m) is the basis path code coverage from automated tests for m.
If CRAP1(m) > 30, we consider the method to be CRAPpy.
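To make the formula concrete, here is a minimal Python sketch of CRAP1 and the threshold check. The function names (crap1, is_crappy, max_complexity_for) are made up for this illustration and are not taken from Crap4J or any other tool; the sketch assumes you already have a method's cyclomatic complexity and its coverage percentage from whatever tools you use.

```python
CRAP1_THRESHOLD = 30  # a method scoring above this is considered CRAPpy


def crap1(complexity: int, coverage_percent: float) -> float:
    """CRAP1(m) = comp(m)^2 * (1 - cov(m)/100)^3 + comp(m)."""
    if complexity < 1:
        raise ValueError("cyclomatic complexity is at least 1")
    if not 0 <= coverage_percent <= 100:
        raise ValueError("coverage must be a percentage between 0 and 100")
    uncovered = 1.0 - coverage_percent / 100.0
    return complexity ** 2 * uncovered ** 3 + complexity


def is_crappy(complexity: int, coverage_percent: float) -> bool:
    """Apply the CRAP1 > 30 rule."""
    return crap1(complexity, coverage_percent) > CRAP1_THRESHOLD


def max_complexity_for(coverage_percent: float) -> int:
    """Largest complexity that stays at or below the threshold for a given coverage."""
    complexity = 1  # complexity 1 never scores above 2, so it is a safe start
    while crap1(complexity + 1, coverage_percent) <= CRAP1_THRESHOLD:
        complexity += 1
    return complexity


if __name__ == "__main__":
    for coverage in (0, 50, 100):
        print(coverage, max_complexity_for(coverage))
```

Two things fall out of the arithmetic. With no test coverage at all, the score reduces to comp(m)^2 + comp(m), so a method with complexity 6 already scores 42 and is CRAPpy. With 100% coverage the score is just comp(m), so a fully covered method only crosses the line once its complexity exceeds 30.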
This CRAP1 formula did not materialize out of thin air. We arrived at this particular function empirically; it’s the result of a best fit curve achieved through a lot of trial-and-error. At the time we had access to the source code for a large number of open source and commercial Java projects, along with their associated JUnit tests. This allowed us to rank code for CRAPpiness using one formula, ask our colleagues if they agreed, and keep iterating until we reached diminishing returns. This way we were able to come up with a curve that was a pretty good fit for the more subjective data we got from our colleagues.
Here’s why we think that CRAP1 is a good anti-pattern to detect. Writing automated tests (e.g., using JUnit) for complex and convoluted code is particularly challenging, so crappy code usually comes with few, if any, automated tests. This means that the presence of automated tests implies not only some degree of testability (which in turn seems to be associated with better, or more thoughtful, design), but it also means that the developers cared enough, knew enough and had enough time to write tests – which is another good sign for the people inheriting the code. These sounded like reasonable assumptions at the time, and the adoption of CRAP1 – especially by the Agile community – reflects that.
Like all software metrics, CRAP1 is neither perfect nor complete. We know very well, for example, that you can have great code coverage and lousy tests. In addition, sometimes complex code is either unavoidable or preferable; there might be instances where a single higher complexity method might be easier to understand than three simpler ones. We are also aware that the CRAP1 formula doesn’t currently take into account higher-order, more design-oriented metrics that are relevant to maintainability (such as cohesion and coupling) – but it’s a start, and the plan is to add more anti-patterns.
Use CRAP On Your Project
Even though Bob and I haven't actively developed or maintained Crap4J in the past few years (shame on us!), other brave developers have been busy porting CRAP to all sorts of languages and environments. As a result, there are many versions of the CRAP metric in open source and commercial tools. If you want to try CRAP on your project, the best thing to do is to run a Google search for the language and tools you are currently using.
For example, a search for "crap metric .net" returned several projects, including crap4n and one called crap4net. If you use Clover, here's how you can use it to implement CRAP. PHP? No problem, someone implemented CRAP for PHPUnit. However, apparently nobody has implemented CRAP for COBOL yet ... here's your big chance!
Until the next blog on CRAP, you might enjoy this vintage video on Crap4J. Please note, however, that the Eclipse plug-in shown in the demo does not work with versions of Eclipse newer than 3.3 - we did say it was a vintage video and that Bob and I have been resting on our laurels!
Posted by Alberto Savoia
How Google Tests Software - A Brief Interlude
Tuesday, February 22, 2011
By James Whittaker
These posts have garnered a number of interesting comments. I want to address two of the negative ones in this post. Both are of the same general opinion that I am abandoning testers and that Google is not a nice place to ply this trade. I am puzzled by these comments because nothing could be further from the truth. One such negative comment I can take as a one-off but two smart people (hey they are reading this blog, right?) having this impression requires a rebuttal. Here are the comments:
"A sad day for testers around the world. Our own spokesman has turned his back on us. What happened to 'devs can't test'?" by Gengodo
"I am a test engineer and Google has been one of my dream companies. Reading your blog I feel that Testers are so unimportant at Google and can be easily laid off. It's sad." by Maggi
First of all, I don't know of any tester or developer for that matter being laid off from Google. We're hiring at a rapid pace right now. However, we do change projects a lot so perhaps you read 'taken off a project' to mean something far worse than the reality of just moving to another project. A tester here may move every couple of years or so and it is a badge of honor to get to the point where you've worked yourself out of a job by building robust test frameworks for others to contribute tests to or to pass off what you've done to a junior tester and move on to a bigger challenge. Maggi, please keep the dream alive. If Google was a hostile place for testers, I would be working somewhere else.
Second, I am going to dodge the negative undertones of the developer vs tester debate. Whether developers can test or testers can code seems downright combative. Both types of engineers share the common goal of shipping a product that will be successful. There is enough negativity in this world and testers hating developers seems so 2001.
In fact, I feel a confession coming on. I have had sharp words with developers in the past. I have publicly decried the lack of testing rigor in commercial products. If you've seen me present you've probably witnessed me showing colorful bugs, pointing to the screen and shouting "you missed a spot!" I will admit, that was fun.
Here are some other quotes I have directed at developers:
"You must be smarter than me because I couldn't write this bug if I was trying to."
"What happened, did the compiler get your eye?"
"What do you say to a developer with two black eyes? Nothing, he's already been told twice."
"Did you hear about the developer who locked himself in his car?"
Ah, those were the good old days! But it's 2011 now and I am objective enough to give developers credit when they step up to the plate and do their job. At Google, many have and they are helping to shame the rest into following suit. And this is making bugs harder to find. I waste so little time on low hanging fruit that I get to dig deeper to find the really subtle, really critical bugs. The signal to noise ratio is just a whole lot stronger now. Yes there are fewer developer jokes but this is progress. I have to make myself feel good knowing how many bugs have been prevented instead of how many laughs I can get on stage demonstrating their miserable failures.
This is progress.
And, incidentally, developers can test. In some cases far better than testers. Modern testing is about optimizing the places where developers test and where testers test. Getting that mix right means a great product. Getting it wrong puts us back in 2001 where my presentations were a heck of a lot funnier.
In what cases are developers better testers than we are? In what cases are they not only poor testers but we're better off not having them touch the product at all? Well, that's the subject of my next couple of posts. In the meantime...
...Peace.
Who reads this blog?
Thursday, February 17, 2011
By Patrick Copeland
Just considering last year...
This blog was read in 181 countries/territories.
20% of visitors came to the site at least 4 times.
About 50% of the visits came from the United States, India, United Kingdom, Brazil, Canada, and Germany.
Within the US, all states are represented, with a majority of visits coming from the significant technology centers.
Top 10 world wide cities visiting (outside of the Bay Area): London, Bangalore, New York, Sao Paulo, Chennai, Hyderabad, Tokyo, Redmond, Seoul, Moscow.
The average visitor stays about two minutes (enough time to read the post). Although, for some reason people in Switzerland stayed for 12 minutes on average and looked at twice as many pages.
We get numerous visits from Central Asia and Melanesia, but the time on site is very small, which indicates the use of bots. As a matter of fact, 30% of the visits are flagged as search engine traffic.
The highest read single post was written by Alberto Savoia with 39,778 visits. Followed in (a distant) second place by a post I wrote. BTW, James' recent posts are really catching fire and I think Alberto could be unseated in 2011.
Below is the view of the traffic by continent...
Thanks for your visits and we welcome your comments.
How Google Tests Software - Part Three
Wednesday, February 16, 2011
By James Whittaker
Lots of questions in the comments to the last two posts. I am not ignoring them. Hopefully many of them will be answered here and in following posts. I am just getting started on this topic.
At Google, quality is not equal to test. Yes I am sure that is true elsewhere too. “Quality cannot be tested in” is so cliché it has to be true. From automobiles to software if it isn’t built right in the first place then it is never going to be right. Ask any car company that has ever had to do a mass recall how expensive it is to bolt on quality after-the-fact.
However, this is neither as simple nor as accurate as it sounds. While it is true that quality cannot be tested in, it is equally evident that without testing it is impossible to develop anything of quality. How does one decide if what you built is high quality without testing it?
The simple solution to this conundrum is to stop treating development and test as separate disciplines. Testing and development go hand in hand. Code a little and test what you built. Then code some more and test some more. Better yet, plan the tests while you code or even before. Test isn’t a separate practice, it’s part and parcel of the development process itself. Quality is not equal to test; it is achieved by putting development and testing into a blender and mixing them until one is indistinguishable from the other.
At Google this is exactly our goal: to merge development and testing so that you cannot do one without the other. Build a little and then test it. Build some more and test some more. The key here is who is doing the testing. Since the number of actual dedicated testers at Google is so disproportionately low, the only possible answer has to be the developer. Who better to do all that testing than the people doing the actual coding? Who better to find the bug than the person who wrote it? Who is more incentivized to avoid writing the bug in the first place? The reason Google can get by with so few dedicated testers is because developers own quality. In fact, teams that insist on having a large testing presence are generally assumed to be doing something wrong. Having too large a test team is a very strong sign that the code/test mix is out of balance. Adding more testers is not going to solve anything.
This means that quality is more an act of prevention than it is detection. Quality is a development issue, not a testing issue. To the extent that we are able to embed testing practice inside development, we have created a process that is hyper incremental where mistakes can be rolled back if any one increment turns out to be too buggy. We’ve not only prevented a lot of customer issues, we have greatly reduced the number of testers necessary to ensure the absence of recall-class bugs. At Google, testing is aimed at determining how well this prevention method is working. TEs are constantly on the lookout for evidence that the SWE-SET combination of bug writers/preventers is skewed toward the latter, and TEs raise alarms when that process seems out of whack.
Manifestations of this blending of development and testing are all over the place from code review notes asking ‘where are your tests?’ to posters in the bathrooms reminding developers about best testing practices, our infamous Testing On The Toilet guides. Testing must be an unavoidable aspect of development and the marriage of development and testing is where quality is achieved. SWEs are testers, SETs are testers and TEs are testers.
If your organization is also doing this blending, please share your successes and challenges with the rest of us. If not, then here is a change you can help your organization make: get developers fully vested in the quality equation. You know the old saying that chickens are happy to contribute to a bacon and egg breakfast but the pig is fully committed? Well, it's true...go oink at one of your developers and see if they oink back. If they start clucking, you have a problem.
How Google Tests Software - Part Two
Wednesday, February 9, 2011
By James Whittaker
In order for the “you build it, you break it” motto to be real, there are roles beyond the traditional developer that are necessary. Specifically, engineering roles that enable developers to do testing efficiently and effectively have to exist. At Google we have created roles in which some engineers are responsible for making others more productive. These engineers often identify themselves as testers but their actual mission is one of productivity. They exist to make developers more productive and quality is a large part of that productivity. Here's a summary of those roles:
The SWE or Software Engineer is the traditional developer role. SWEs write functional code that ships to users. They create design documentation, design data structures and overall architecture and spend the vast majority of their time writing and reviewing code. SWEs write a lot of test code including test driven design, unit tests and, as we will explain in future posts, participate in the construction of small, medium and large tests. SWEs own quality for everything they touch whether they wrote it, fixed it or modified it.
The SET or Software Engineer in Test is also a developer role except their focus is on testability. They review designs and look closely at code quality and risk. They refactor code to make it more testable. SETs write unit testing frameworks and automation. They are a partner in the SWE code base but are more concerned with increasing quality and test coverage than adding new features or increasing performance.
The TE or Test Engineer is the exact reverse of the SET. It is a role that puts testing first and development second. Many Google TEs spend a good deal of their time writing code in the form of automation scripts and code that drives usage scenarios and even mimics a user. They also organize the testing work of SWEs and SETs, interpret test results and drive test execution, particularly in the late stages of a project as the push toward release intensifies. TEs are product experts, quality advisers and analyzers of risk.
From a quality standpoint, SWEs own features and the quality of those features in isolation. They are responsible for fault tolerant designs, failure recovery, TDD, unit tests and for working with the SET to write tests that exercise the code for their feature.
SETs are developers who provide testing features: for example, a framework that can isolate newly developed code by simulating its dependencies with stubs, mocks and fakes, or submit queues for managing code check-ins. In other words, SETs write code that allows SWEs to test their features. Much of the actual testing is performed by the SWEs; SETs are there to ensure that features are testable and that the SWEs are actively involved in writing test cases.
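For readers who have not worked with test doubles, here is a small Python sketch of what isolating code by simulating its dependencies can look like. It uses the standard library's unittest.mock rather than any Google-internal framework, and every name in it (FakeUserStore, greeting_for, get_name) is hypothetical, made up purely for illustration.

```python
from unittest.mock import Mock


class FakeUserStore:
    """A fake: a simplified, in-memory stand-in for a real user database."""

    def __init__(self, users):
        self._users = dict(users)

    def get_name(self, user_id):
        return self._users[user_id]


def greeting_for(user_store, user_id):
    """Code under test: it only needs an object that offers get_name()."""
    return f"Hello, {user_store.get_name(user_id)}!"


# Using a fake: real (if simplified) behavior, no database required.
fake_store = FakeUserStore({42: "Ada"})
fake_result = greeting_for(fake_store, 42)

# Using a mock: canned answer, plus a check on how the dependency was called.
mock_store = Mock()
mock_store.get_name.return_value = "Grace"
mock_result = greeting_for(mock_store, 7)
mock_store.get_name.assert_called_once_with(7)
```

The fake is handy when many tests need plausible behavior from the dependency; the mock is handy when the test cares about the interaction itself, here that get_name was called exactly once with the right id.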
Clearly the SET's primary focus is on the developer. Individual feature quality is the target, and enabling developers to easily test the code they write is the SET's main job. This development focus leaves one large hole which I am sure is already evident to the reader: what about the user?
User focused testing is the job of the Google TE. Assuming that the SWEs and SETs performed module and feature level testing adequately, the next task is to understand how well this collection of executable code and data works together to satisfy the needs of the user. TEs act as a double-check on the diligence of the developers. Any obvious bugs are an indication that early cycle developer testing was inadequate or sloppy. When such bugs are rare, TEs can turn to their primary task of ensuring that the software runs common user scenarios, is performant and secure, is internationalized and so forth. TEs perform a lot of testing and test coordination tasks among TEs, contract testers, crowd sourced testers, dogfooders, beta users and early adopters. They communicate among all parties the risks inherent in the basic design, feature complexity and failure avoidance methods. Once TEs get engaged, there is no end to their mission.
Ok, now that the roles are better understood, I'll dig into more details on how we choreograph the work items among them. Until next time...thanks for your interest.