Proposal
Name of Tutorial
Reading Tea Leaves: Demystifying Your Metrics
Intent
Metrics can make software development a rich, reflective, cognizant adventure. On the other hand, code metrics can also be misused to bludgeon and abuse the team. In the tutorial, you will learn how to successfully employ metrics with an agile attitude. Additionally, you will find out how to avoid getting exactly what you measure for. Metrics can help you and your team stay on the path of continuous agile improvement.
Summary
So you think metrics are going to save your project? In this tutorial, you will learn to distinguish pragmatic advice from mystic measures. You will run various code analysis tools, which will:
- Raise the refactor alarm
- Restructure technical debt
- Archive project trends
- Illuminate the big picture
- Spice up retrospectives
- Establish a team lore
You will see interpretation of metrics and trends for a number of multi-year projects with various degrees of "agility". If possible, bring your laptop with your favorite source code... and prepare yourself for surprising revelations.
Audience
- Those who love metrics
- Those who hate metrics
- Those who can't decide how they should feel about metrics
Outline
The following is the presentation outline for the tutorial. At times during the presentation, hands-on exercises will be done to reinforce the concepts being learned. For example, attendees may be asked to use a metrics-gathering program to search for the longest function in their favorite source code, then asked to explain how it can be refactored.
Throughout, we will evaluate each metric in terms of how well it can raise the refactor alarm, restructure technical debt, archive project trends, illuminate the big picture, spice up retrospectives, and establish a team lore.
- Introduction to metrics
- Create lists of metrics you have used
- Useful
- Abusive
- Categorization of metrics
- Percentages
- Rankings
- Trends
- Alerts
- Line Counts
- Methods
- Files
- Project
- Version Control Statistics
- Build System Statistics
- Code coverage
- Emma
- Commercial tools
- Testing
- Test counts
- Failing test counts
- Duplication Detection
- PMD/CPD
- Custom
- Other Details
- Cyclomatic Complexity
- Longest function
- Average function length
- Blocks of dead code
- Uncalled methods
- Conditionals
- Imports/Include Dependencies
- TODO comments
- 30 second wrap-up
- Conclusion
Material to be distributed
- Lists of commercial and open-source metrics tools with their applicability.
- Source code to the metrics tool created by the presenters
- Slides from presentation
- Examples of historical metrics in real-world projects
About the organizers
Zhon and Jeff have been gathering practical source code metrics on green-field and legacy enterprise software projects with various degrees of agility for 4 years (before it was hip to be agile), and have authored their own open-source metrics program in Ruby.
Jeff Grover has worked in the software industry for 13 years and is currently a Principal Engineer for Symantec's enterprise security products. He has participated in 'Extreme Fishbowl' presented at XP/Agile Universe, UJUG, and XPUtah. He represented Symantec on the Agile Experiences panel of XP/Agile Universe 2002.
Zhon Johansen co-founded Acadyn, a company helping small businesses with IT and information security needs. He has studied, practiced, and taught Extreme Programming since early 1999. He also co-founded XPUtah in December of 2001. He organized and presented 'Extreme Fishbowl' to XP/Agile Universe, UJUG, and other conferences. He also helped coach 'XP for a Day' presented at XP/Agile Universe.
Zhon & Jeff both presented "Making money with (or without) software" at the 2004 Agile Development Conference.
NOTES
Anger - emotion
Without metrics, agile methods are just hacking...
Big commercial tool...
Naming the metrics tool
MeToo AMeTo
Details about metrics
The code metrics we keep can help us in a number of ways:
To improve the quality of our code (avoid QualityDebt)
- To target refactorings
- To promote the addition of tests to untested code
- To suggest some XP practices which may need improvement
To keep our code building & testing cross-platform (Linux, at least)
- To flag when our unit tests or the build are left broken overnight
- To detect the addition of TLC (Truly Lousy Code) to our project
- To maintain a history of these things as our code evolves
Details Links
Zhon and Jeff's metrics tool produces a very long and not-so-well formatted raw text output. Mostly, it is useful because it contains the top 100 "worst offenders" in each of the metrics categories below which apply, although you may have to search a bit to find what you're looking for.
- Unused Import Details - Shows you which *.java files have unused "import" statements, and recommends a set of imports to replace them with.
- Compile Details - Contains the raw output of the -debug "ant" build for the project. You can use this to debug the build on Linux, should there be any compile problems or the like.
Break-down of Individual Metrics
Lines of Code - There's not much to say about this metric except that it reveals quantity, not quality. The count will probably go up as the project evolves, but when it goes down, it GivesGoosebumpsToTheArchitect. Code quantity is a liability, not an asset (functionality is the asset).
Longest Function - This is primarily a pointer to a refactor target. The refactor in question here is ExtractMethod. Long functions are difficult to read and debug, and usually violate the SingleResponsibilityPrinciple.
Average Function Length - This is like the above metric, but an average over all the files, and as such is a fairly good indicator of how well we are obeying the SingleResponsibilityPrinciple and/or how often we perform the ExtractMethod refactor.
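As a rough illustration of the two function-length metrics above, here is a minimal sketch. It is not the presenters' Ruby tool. It assumes Python source as input, because Python's standard `ast` module makes function boundaries easy to find; measuring Java or C would need a parser for that language. The names `function_lengths` and `longest_and_average` are made up for this example.

```python
import ast


def function_lengths(source):
    """Return (name, line_count) for every function or method in Python source."""
    tree = ast.parse(source)
    lengths = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # end_lineno is available on Python 3.8 and later.
            lengths.append((node.name, node.end_lineno - node.lineno + 1))
    return lengths


def longest_and_average(source):
    """Return the longest function as (name, lines) and the mean length."""
    lengths = function_lengths(source)
    if not lengths:
        return None, 0.0
    longest = max(lengths, key=lambda item: item[1])
    average = sum(count for _, count in lengths) / len(lengths)
    return longest, average
```

The longest function is the candidate for ExtractMethod; the average is the number to watch as a trend.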
Percent Duplicated Lines - This is a measure of how well we obey the OnceAndOnlyOnce or DryPrinciple. It shows what percent of lines are duplicated (exactly, although we'd like to change it to approximately using an EditDistanceAlgorithm or the like) across the project. If this goes up, it may indicate CopyAndPasteCoding.
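The exact-match version of this metric could be computed along the following lines. This is only a sketch under assumptions: whitespace is stripped before comparing, blank lines are ignored, and `percent_duplicated_lines` is a hypothetical name. A real tool would probably also skip trivial lines such as a lone closing brace.

```python
from collections import Counter


def percent_duplicated_lines(files):
    """files maps a file name to its text.

    Returns the percentage of non-blank lines whose stripped text
    occurs more than once anywhere in the project.
    """
    lines = [
        line.strip()
        for text in files.values()
        for line in text.splitlines()
        if line.strip()
    ]
    if not lines:
        return 0.0
    counts = Counter(lines)
    # Every occurrence of a repeated line counts as duplicated.
    duplicated = sum(n for n in counts.values() if n > 1)
    return 100.0 * duplicated / len(lines)
```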
Highest Percent Internal File Duplication - This metric also shows a current target for the ExtractMethod refactor, though for a different reason: the duplicate pieces of code are pulled into one method, which is then called from each place where the code was duplicated. In this case, the refactor could be ConsolidateDuplicateConditionalFragments.
- Total Blocks of Dead Code - This just shows where we have a bunch of "commented-out" code that's not being used. It may indicate a refactor that should have happened, a potential bug or something that needs attention. In most cases, it's just "screen clutter" that can be removed.
- Number of Uncalled Functions/Methods - Flags functions or methods that don't appear to be used anywhere else in the project. This is worse than the "Dead Code" above, in that it costs compile time and memory space and could actually cause bugs if called by unwitting programmers. Many of these may be "red herrings" because they are callbacks invoked from external frameworks, libraries, etc.
- Total Unit Tests - This metric graph should always be going up (unlike most of the other metrics), with a slope similar to (or steeper than) that of the "lines of code" metric. If it becomes flat, it's usually a bad sign and may indicate untested code.
- Failing Unit Tests - This should never be anything but zero (0). If it is nonzero, then someone has checked in code that broke a test and left it overnight... they'll be buying doughnuts for the team in the morning :-).
- Unit Test Errors - This may indicate that someone has broken the code (as above), but more likely it means that the code doesn't work cross-platform (i.e., on Linux, where the metrics are run) or that there is a configuration or build problem.
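Returning to the Total Blocks of Dead Code entry in this list: one way to approximate it is to look for runs of comment lines that look like code. The sketch below is a heuristic, not the presenters' implementation. It assumes `//` comments only, treats a comment ending in `;`, `{` or `}` as code-like, and requires at least two such lines in a row. The name `count_dead_code_blocks` and the `min_lines` parameter are made up for this example.

```python
import re

# Assumption: a "//" comment whose text ends in ';', '{' or '}' is probably code.
CODE_LIKE = re.compile(r"^\s*//.*[;{}]\s*$")


def count_dead_code_blocks(source, min_lines=2):
    """Count runs of at least min_lines consecutive code-like comment lines."""
    blocks = 0
    run = 0
    for line in source.splitlines():
        if CODE_LIKE.match(line):
            run += 1
        else:
            if run >= min_lines:
                blocks += 1
            run = 0
    # A run that reaches the end of the file still counts.
    if run >= min_lines:
        blocks += 1
    return blocks
```

Ordinary prose comments rarely end in those characters, so they break a run rather than extend it.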
Most "switch" Conditions - Switches are occasionally a "necessary evil" for object-oriented programmers. Having more than one per method/function is certainly a bad thing, as it indicates unnecessary conditional complexity, and ExtractMethod would be prescribed. This metric should never be more than 1.
Most "case" Conditions - Related to the above metric, this measures how many individual "cases" are in a method/function. This is just another way to flag methods with lots of conditionals. Keeping this metric in the single digits GivesGoosebumpsToTheArchitect.
Most "if/else" Conditions - Like the above two, this metric shows excessive conditional logic which can usually be removed with ExtractMethod. A very low number GivesGoosebumpsToTheArchitect, guaranteed.
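The three conditional metrics above come down to counting keywords inside one method. A minimal sketch follows, assuming the caller has already isolated the text of a single method (finding method boundaries is the hard part and is not shown). The function name and the choice to count `if` and `else` tokens separately are assumptions for this example, so an `else if` contributes two.

```python
import re


def count_conditionals(method_body):
    """Count switch, case, and if/else keywords in the text of one method."""
    # Blank out string literals, then drop comments, so that keywords
    # appearing inside them are not counted.
    code = re.sub(r'"(?:\\.|[^"\\])*"', '""', method_body)
    code = re.sub(r"//[^\n]*|/\*.*?\*/", "", code, flags=re.DOTALL)
    return {
        "switch": len(re.findall(r"\bswitch\b", code)),
        "case": len(re.findall(r"\bcase\b", code)),
        "if_else": len(re.findall(r"\b(?:if|else)\b", code)),
    }
```

Taking the maximum of each count over all methods would give the "Most ..." figures for a project.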
Unused "import" Statements - On the surface, this metric just seems to point out a minor annoyance which, if fixed, will probably speed up the compile slightly and make moving code around a bit easier. But in a sneaky way, it can demonstrate that refactoring is happening (or not), and thus whether you are slipping into QualityDebt. When developers move methods or classes around, they seldom bother to remove the imports that are no longer needed, so little jumps in this graph can be a good thing: they mean refactorings are happening. Cleaning up the unwanted imports is okay too, so a "sawtooth" graph is desirable.
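Detecting unused imports can be approximated with a text search. This is a sketch, not how the presenters' tool works. It assumes Java source, skips wildcard imports, and treats an import as used if its simple class name appears anywhere outside the import lines (including in comments or strings, so it can miss some unused imports). `unused_imports` is a hypothetical name.

```python
import re

# Matches single-type imports such as "import java.util.List;".
# Wildcard imports ("import java.io.*;") deliberately do not match.
IMPORT_RE = re.compile(r"^\s*import\s+(?:static\s+)?([\w.]+)\s*;", re.MULTILINE)


def unused_imports(java_source):
    """Return imported names whose simple name never appears outside imports."""
    body = IMPORT_RE.sub("", java_source)
    unused = []
    for match in IMPORT_RE.finditer(java_source):
        qualified = match.group(1)
        simple = qualified.rsplit(".", 1)[-1]
        if not re.search(r"\b" + re.escape(simple) + r"\b", body):
            unused.append(qualified)
    return unused
```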
- Untested Source Files - This is, very simply, a ratio of source files to test files (times 100). It is a very rough way to measure test coverage. The preferred (more accurate and detailed) measure is found in the Coverage Details link on the page (described below).
- TODO Comments - All those little things we leave undone... my how they add up!
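The Untested Source Files entry above describes a simple file-count ratio. A related per-file check can be sketched as follows. It rests on an assumed naming convention that is not stated in the original, namely that `Foo.java` is tested when some `FooTest.java` exists anywhere in the project. The function name and the `test_suffix` parameter are likewise made up for this example.

```python
def untested_source_files(paths, test_suffix="Test.java"):
    """Return (untested_files, percent_untested) for a list of file paths.

    Assumption: Foo.java counts as tested when some FooTest.java exists.
    """
    names = [path.rsplit("/", 1)[-1] for path in paths]
    tests = {name for name in names if name.endswith(test_suffix)}
    sources = [
        name for name in names
        if name.endswith(".java") and name not in tests
    ]
    untested = [
        name for name in sources
        if name[:-len(".java")] + test_suffix not in tests
    ]
    percent = 100.0 * len(untested) / len(sources) if sources else 0.0
    return untested, percent
```

Like the ratio itself, this says nothing about how thoroughly a file is tested, only whether a test file exists.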
Other Metrics
Most Often Changed Files - According to KentBeck, the files you change the most often should be the most reliable, stable, and have the most tests. Thus far, it appears that we have struggled with this principle.
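A ranking like this can be derived from the version-control log. The sketch below assumes the log has already been reduced to one changed file path per line, with blank lines between commits. With git, for instance, `git log --name-only --pretty=format:` produces that shape. The original does not say which version-control system the project used, so that command and the name `most_changed_files` are assumptions.

```python
from collections import Counter


def most_changed_files(log_text, top=10):
    """Rank file paths by how many commits touched them.

    Assumption: log_text has one changed file path per line, with
    blank lines between commits.
    """
    counts = Counter(
        line.strip() for line in log_text.splitlines() if line.strip()
    )
    return counts.most_common(top)
```

Comparing the top of this list against the per-file test counts shows whether the most volatile files are also the best tested.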
Coverage Details - This report generated by the JProbeCoverageTool indicates which parts of our code are covered by unit tests (in the "AllTests" suite) and which are not. If you're in the mood to write some unit tests, this is the place to start to show you where you can get the most coverage bang for your testing buck.
Other Tools
TogetherJ Audit - This has some great suggestions that we should strive to incorporate into our daily JavaProgrammingParadigms. General suggestions for safety, optimization, security, clarity, etc. of Java coding.
- Agitar Management Dashboard
- Cruise Control
- PMD
- CPD
http://javacentral.compuware.com/products/optimaladvisor/index.shtml ("Download Free Edition")
http://blogs.msdn.com/mswanson/articles/154460.aspx - using cyclomatic complexity to improve the code
http://www.developertesting.com/archives/month200502/20050218-IsItWiseToAimForNTF.html - getting what you measure for
Useful Technologies
http://www.nist.gov/dads/HTML/dynamicprog.html - Dynamic Programming to find duplication
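As an illustration of how dynamic programming applies to duplication, here is a minimal sketch of Levenshtein edit distance and a brute-force search for near-duplicate lines. The threshold of 10 echoes the "edit dist. of 10" idea in the notes on this page. The function names are made up, and the pairwise comparison is quadratic, so this is for small inputs only.

```python
def edit_distance(a, b):
    """Levenshtein distance between strings a and b, computed row by row."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute or match
            ))
        previous = current
    return previous[-1]


def near_duplicates(lines, max_distance=10):
    """Return index pairs (i, j), i < j, of lines within max_distance edits."""
    pairs = []
    for i in range(len(lines)):
        for j in range(i + 1, len(lines)):
            if edit_distance(lines[i], lines[j]) <= max_distance:
                pairs.append((i, j))
    return pairs
```

A fixed threshold of 10 will match most short lines to each other, so a practical version would likely scale the threshold with line length.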
Here are some ideas for future versions of the metrics program:
- checkins per day to source control
- test file ratio (gnuplot)
- compile details / warnings
- critical dependencies parameter (which dependencies are critical?)
- split detailed reports into multiple
- make HTML output instead of text
- conditional logic details
- c and java lint
- conditionals per func - if, else, case
- bad package dependencies (in Java, cycles)
- longest switch statements or any if, loop, catch, etc. CODE BLOCK (lines, not cases)
- unused classes
- unneeded imports
- stable abstractions principle (where is stability needed? are they interfaces?)
- Interface pollution, segregation - unused/unrelated methods (split classes)
- variables unused in most methods
- source files without unit test files
- low test methods vs. class methods
- test lines of code vs. real lines
- low asserts per test method
- lots of ifs or switch in a function
- global variables, lots of local vars
- duplicated lines within an edit distance of 10
- duplicated obscure system calls
- lapses in coding standards
- large numbers of methods in a class
- returning constants (return codes) vs. exceptions
- excessive non-javadoc comments
- too-short identifier names (e.g. "x")
- c vs. cpp files
- use of structs and pointers
- hard-coded values and strings - i18n
- gotos
- type casting
- pointer arithmetic
- public, private interface enforcement
- accidental link-time polymorphism
- #ifdef, defines (excessive compiler directives)
- files with the same name in different places
- system (OS) dependencies
- files with no dependencies
- compile - cycle time
- IDEA: reward removal of code
